ifni.co | CONCEPTS | REFERENCES | ABOUT |

Programmers have different ideas about the best way to program and which language to use.
Different personalities, different preferences, I think.
I have always preferred weakly typed languages, even though my introduction to programming was in Pascal. I like to approach a problem top-down.
What I mean by this is that I start with a rough, abstract solution and drill down into the details as I go.
The drawback of programming this way is that it is somewhat slower initially.
But you get a better view of how to optimize the algorithm of the application conceptually.
Most of the time my pre-alpha version is a one-off script, which at the alpha stage becomes an OO skeleton app ... and so on.

Normally weakly typed languages are better suited for this type of task. Once you have a working prototype you can "fish" out the details.
On the other hand, in a language where the types and other minute details can be specified, the compiler has more ways to optimize the code for speed and memory.
So I was wondering whether I could use a strongly typed language the way I use a weakly typed one, so that I get the prototyping speed of a scripting language with the ability to optimize for speed over time if I want to. That is how I decided to implement in C++ some of the idioms I normally use when I write Perl.

Something like this would serve several purposes.

- First it could serve as a stepping stone for Perl programmers wishing to learn C++, which is always good.

- Second, it would give C++ some of the prototyping speed of Perl.

- Third it would be fun.

Probably the most obvious obstacle in our way is the need to micromanage variable type declarations, casting, conversion and so on. We can get rid of it by implementing a Perl-like Scalar type, i.e. a type that can hold a number, a string or a boolean and that also behaves as expected when you use operations like

Let's start with a simple example:

    #include "perl_like.h"
    using namespace pl;

    int main() {
        Scalar s1 = "The quick brown fox...";
        cout << "First it is a string: " << s1 << nl;
        s1 = 15;
        cout << " and then a number : "<< s1 << nl;
        return 0;
    }

What we did here is first to

Our next task is to implement the

See, we did not declare a new variable of any of the number types; we just assigned the number to the same variable. Let's compile and execute to see whether the magic works:

    > g++ scalar_example.cpp -o example
    > ./example
    First it is a string: The quick brown fox...
     and then a number : 15

Cool, it seems to be working. Let's add some more stuff to the mix:

    00: #include "perl_like.h"
    01: using namespace pl;
    02:
    03: int main() {
    04:
    05:   scalar s1 = "The quick brown fox...";
    06:
    07:   cout << "First it is a string: " << s1 << nl;
    08:   s1 = 15;
    09:   cout << " and then a number : "<< s1 << nl;
    10:
    11:   pnl;
    12:   scalar s2(10);
    13:   cout << "Then we sum two scalars : s1(15) + s2(10) = " << s1 + s2 << nl;
    14:
    15:   //string concatenation is done using |, instead of +
    16:   scalar $s3, $s4;
    17:   $s3 = "conca"; $s4 = "tenated";
    18:   cout << "Lets concatenate strings : ";
    19:   scalar $s5 = $s3 | $s4;
    20:   cout << $s5 << nl;
    21:
    22:   //regex match
    23:   if ($s5 ^= "tena") {
    24:     cout << "match regular expression, succesfull" << nl;
    25:   }
    26:
    27:   //refer to how it behaves in Perl, if in doubt
    28:   scalar $s6 = $s5 * s1;
    29:   cout << "Multiplying a string * number yelds : " << $s6 << nl;
    30:
    31:   scalar $s7 = 0;
    32:   if ($s7) {}
    33:   else { cout << "perl idiom if($s) : 0 => false" << nl; }
    34:
    35:   $s7 = 1;
    36:   if ($s7) cout << "perl idiom if($s) : not 0 => true" << nl;
    37:
    38:   //you can always dump a variable to see the internal
    39:   cout << "Here is how dumping a variable works $s5.dump()" << nl;
    40:   $s5.dump();
    41:
    42:   return 0;
    43: }

Here is what the output looks like:

    > g++ scalar_example.cpp -o example
    > ./example
    First it is a string: The quick brown fox...
     and then a number : 15
    Then we sum two scalars : s1(15) + s2(10) = 25
    Lets concatenate strings : concatenated
    match regular expression, succesfull
    Multiplying a string * number yelds : 0
    perl idiom if($s) : 0 => false
    perl idiom if($s) : not 0 => true
    Here is how dumping a variable works $s5.dump()
    .type:2, .num:-1.43217e-05, .str:concatenated

Line 12: We declare another scalar variable

Line 13: Then we sum the two scalars

Line 16: What follows is something important to know about Scalar, which at first may seem counterintuitive, but makes perfect sense if you think about it. Because

(Because of the way we are implementing Scalar, i.e. overloading almost all the operations, we can't use ".", which is used in Perl for concatenation). Btw, if it makes sense to do concatenation with

The most observant of you have probably spotted something weird for C++. When I declare scalar variables I prefix the name with a dollar sign

Line 23: Here is another Perl idiom: comparing a string against a regular expression. If you have to do regexes in C/C++ you have to do a lot of preparation work. I decided to steal yet another operator for this goal, i.e.

Line 28: The next thing is not important, but I included it here for demonstration purposes. Multiplying (string * number) returns 0. If you want to predict the result of a Scalar operation whose two operands have different internal types, just put ZERO in place of a string that does not convert to a number and then do the numerical operation.

Line 32,36 : The following two examples are yet another Perl idiom, namely checking a variable in an

Line 40: Last, but not least I provided you with a

    typedef double number;
    enum scalar_subtype { NUMBER = 1, STRING = 2 };
    .....
    struct {
        number n;
        string s;
    } value;
    scalar_subtype sub_type;

As you can see in addition to the data itself we have one more attribute

Any operation that we are about to do on the Scalar has to consult the internal representation before executing. The next core part of the implementation is the operations themselves. Here we have a little problem on our hands... there are 4 major operations in which numbers can participate, i.e. addition, subtraction, multiplication and division, then we have 4 shortcut operations

So if we do rough calculations this mean we have to implement

To lower the number of those possibilities we will be clever in three ways.

- First we will implement the **+=** operation for Scalar-Scalar. This allows us to "skip" the implementation of "+", because it is almost the same thing, as we will see in a minute.

- Second we will use templates to "shortcut" Scalar-String and Scalar-Number operations to Scalar-Scalar.

- And third we will provide conversion from C++ intrinsic types to Scalar, so we can make those Scalar shortcuts work. (If you know how to convert a specific type to Scalar, you can always convert and then apply the Scalar-Scalar operation.)
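To make the three tricks concrete, here is a minimal sketch on a toy, numbers-only class. The name `Num` and its `value()` method are made up for this illustration and are not part of Scalar.h; the real Scalar also has to deal with strings.

```cpp
// Toy stand-in for Scalar (hypothetical, numbers only) showing the three tricks.
class Num {
    double v_;
public:
    // Trick 3: a converting constructor, so intrinsic numbers become Num.
    Num(double v = 0) : v_(v) {}

    double value() const { return v_; }

    // Trick 1: only the shortcut operator is written by hand.
    Num& operator+=(const Num& rhs) {
        v_ += rhs.v_;
        return *this;
    }

    // Trick 2: one template covers Num + Num, Num + int, Num + double, ...
    template <class T>
    Num operator+(const T& rhs) const {
        Num rv = *this;   // copy the left operand
        rv += Num(rhs);   // convert the right operand, then reuse +=
        return rv;
    }
};
```

With this sketch `Num a(15); a + 10` yields a `Num` holding 25 and leaves `a` untouched. Because the template is a member, only the right operand is converted: `a + 10` compiles here, `10 + a` does not.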

Before we start with the gritty details, let me mention several utility methods that we will use all over the place: the attribute accessors and such.

We will use two-level access to the Scalar storage. The first-level methods are protected and should be used only by the class constructors and by the second-level methods. This way we protect ourselves from recursively calling the storage accessors, and make it easier to implement Scalar with another way of storing the data in the future.

- **set_num(), get_num()**
- **set_str(), get_str()**
- **set_type(), get_type()**

- getters: **num(), str()**
- setters: **num(x), str(x)** - set the internal value and type
- checkers: **is_num(), is_str()** - return the current internal type
- converters: **to_num(), to_str()** - whatever the internal type is, get us back the thing we want.
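Here is a rough sketch of how the two access levels could fit together. The method names follow the lists above, but the class name `MiniScalar` and all the method bodies are my assumptions for illustration; the real code lives in Scalar.h and may differ.

```cpp
#include <sstream>
#include <string>

typedef double number;
enum scalar_subtype { NUMBER = 1, STRING = 2 };

class MiniScalar {   // hypothetical name, not the class from Scalar.h
    struct { number n; std::string s; } value;
    scalar_subtype sub_type;

protected:
    // Level 1: raw storage access, used only by constructors and level 2.
    void set_num(number n) { value.n = n; sub_type = NUMBER; }
    void set_str(const std::string& s) { value.s = s; sub_type = STRING; }
    number get_num() const { return value.n; }
    const std::string& get_str() const { return value.s; }
    scalar_subtype get_type() const { return sub_type; }

public:
    MiniScalar() { set_num(0); }

    // Level 2: getters, setters and checkers built on top of level 1.
    number num() const { return get_num(); }
    std::string str() const { return get_str(); }
    void num(number n) { set_num(n); }
    void str(const std::string& s) { set_str(s); }
    bool is_num() const { return get_type() == NUMBER; }
    bool is_str() const { return get_type() == STRING; }

    // Converters: give back the wanted form whatever the internal type is.
    number to_num() const {
        if (is_num()) return num();
        std::istringstream is(str());
        number n = 0;
        is >> n;
        return is.fail() ? 0 : n;   // non-numeric strings count as 0
    }
    std::string to_str() const {
        if (is_str()) return str();
        std::ostringstream os;
        os << num();
        return os.str();
    }
};
```

If the storage ever changes (a single string, a union, ...), only the level 1 methods need to be rewritten in a layout like this.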

There are at least two other possible implementations: as a string (you just convert it back and forth on every use as needed), or as a C++ union; the new standard seems to allow unions of intrinsic types and strings (I didn't implement it this way, because I wanted Scalar to be as backward compatible as possible). For the pure string implementation I have a semi-working version, which I may publish with some useful remarks if I have time some day. Btw, to go off on yet another tangent, if it were not for the "restrictiveness" of C++, i.e. disallowing implicit conversion of strings via

... where were we? ... Yep, now that we know how to access and edit the Scalar's internal data, let's implement the class constructors:

    Scalar() { set_num(0); };
    //copy constructor : Scalar $s = 55
    Scalar(const Scalar& c) {
        c.get_type() == NUMBER ? set_num(c.get_num()) : set_str(c.get_str());
    }
    Scalar(const string& x) { set_str(x); };
    Scalar(const char* x) {
        string s = x;//convert to string first
        set_str(s);
    };
    Scalar(const number& n) { set_num(n); };//$s = 55
    Scalar(const int& n) { set_num(n); };//Scalar $s = 0; zero-ambiguity assignment conv.

... starting with the copy constructor, which is normally called when you pass an argument by value or initialize a new Scalar from an existing one... Then we need the empty constructor and, last but not least, constructors to create a Scalar from intrinsic C++ types, which we call using the function-like syntax, i.e.

Next we have cast operators, so that the compiler can handle those automatically, instead of us doing explicit casting.

    operator number() const { return to_num(); }
    operator float() const { return to_num(); }
    operator int() const { return to_num(); }
    operator char() const { return to_num(); }
    operator string() const { return to_str(); }
    //mimic boolean
    operator bool() const { return is_num() ? num() != 0 : str2num(str()).first; }

Another important operator is assignment:

    Scalar& operator = (const Scalar& rhs) {
        if (this == &rhs) return *this;//self-assignment no,no..!
        rhs.is_num() ? num(rhs.num()) : str(rhs.str());
        return *this;
    }

    template<class T> Scalar& operator = (const T& rhs) {
        return *this = Scalar(rhs);
    }

The variable

The first thing we want to do is check that this is not a self-assignment. Next, based on the internal type of rhs, we set the value in the lhs Scalar, and finally we return a reference. (

The second method declaration in this snippet handles the cases where our right-hand operand is not a Scalar. We use a template to catch all other types. This catch-all operator first creates a new Scalar on the fly (remember: we already implemented the constructors to create a Scalar from intrinsic types) and then calls our Scalar = Scalar assignment. Simple, eh!
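To see the catch-all assignment in isolation, here is a small sketch on a made-up class `Box` that only stores text. `Box` and `text()` are hypothetical names used for illustration; only the shape of the two `operator=` declarations mirrors the snippet above.

```cpp
#include <string>

class Box {   // hypothetical stand-in for Scalar, stores text only
    std::string text_;
public:
    Box() {}
    Box(const std::string& s) : text_(s) {}
    Box(const char* s) : text_(s) {}
    Box(int n) : text_(std::to_string(n)) {}

    // The one "real" assignment: Box = Box.
    Box& operator=(const Box& rhs) {
        if (this == &rhs) return *this;   // self-assignment, nothing to do
        text_ = rhs.text_;
        return *this;
    }

    // Catch-all: build a Box from anything a constructor accepts,
    // then fall through to the Box = Box assignment.
    template <class T>
    Box& operator=(const T& rhs) {
        return *this = Box(rhs);
    }

    const std::string& text() const { return text_; }
};
```

When the right-hand side already is a `Box`, the compiler prefers the non-template overload, so the template never calls itself. With `Box b;`, both `b = "abc";` and `b = 42;` go through the template and end up in the first operator.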

You should read this template declaration thing as follows : "Match

OK. The accessors, constructors, cast'ers, assignment and the utility methods give us everything necessary, so we can finally implement the Scalar overloaded operators.

We will start with the shortcut-addition operator:

    Scalar& operator += (const Scalar& rhs) {
        if (is_num() && rhs.is_num()) {
            num(num() + rhs.num());
            return *this;
        }
        if (is_num() && rhs.is_str()) {
            number n1 = rhs.to_num();
            num( n1 ? num() + n1 : num() + 0);
            return *this;
        }
        if (is_str() && rhs.is_num()) {
            number n1 = to_num();
            num( n1 ? n1 + rhs.num() : 0 + rhs.num());
            return *this;
        }
        if (is_str() && rhs.is_str()) {
            number n1 = to_num();
            number n2 = rhs.to_num();
            //logical XOR : first case str+str OR num+num, else ....
            if (!n1 != !n2) {
                //exactly one side converted to a non-zero number;
                //the second branch must use n2, otherwise "abc" + "5" gives 0
                n1 && !n2 ? num(n1 + 0) : num(0 + n2);
            } else {
                n1 && n2 ? num(n1 + n2) : num(n1 + n2);
            }
        }
        return *this;
    }

    template<class T> Scalar operator + (const T& rhs) {
        //first make a copy then do shortcut-summation
        Scalar $rv = *this;
        $rv += Scalar(rhs);
        return $rv;
    }

Our first argument is always the one we operate on; the second is our right-hand operand, i.e. this += rhs, where this points to the current object. In the case above we don't have to use this explicitly to access lhs.method(); we just call method(). For the right-hand operand we have to be explicit, i.e. rhs.method().

Our first order of business is to check the internal type of both operands and based on that act accordingly.

As I already mentioned, but it does not hurt to repeat, there are two methods that will help us with that:

The code is almost self-explanatory, but I will go ahead and explain it. In principle we need to cover 4 cases, i.e.

Don't be scared of the

If you look carefully again, all modifications were done on the left operand, i.e. *this.
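One way to double-check the four cases is to notice that, if a string that does not convert counts as 0 (which is how Perl adds), every branch boils down to "numeric value of the left operand plus numeric value of the right operand". The sketch below shows that reading on a made-up type `Val`; it is a simplification for illustration, not the code from Scalar.h.

```cpp
#include <sstream>
#include <string>

// Hypothetical minimal value: either a number or a string.
struct Val {
    bool is_str;
    double n;
    std::string s;

    Val(double x) : is_str(false), n(x) {}
    // The int overload keeps a literal 0 from being ambiguous
    // between double and const char*.
    Val(int x) : is_str(false), n(x) {}
    Val(const char* x) : is_str(true), n(0), s(x) {}

    // Numeric view: strings that do not start with a number count as 0.
    double to_num() const {
        if (!is_str) return n;
        std::istringstream is(s);
        double x = 0;
        is >> x;
        return is.fail() ? 0 : x;
    }

    // All four type combinations collapse into one numeric addition,
    // and the result is always stored as a number.
    Val& operator+=(const Val& rhs) {
        n = to_num() + rhs.to_num();
        is_str = false;
        s.clear();
        return *this;
    }
};
```

The explicit four-branch version is still handy while debugging, because each type combination gets its own place for a breakpoint or a special rule.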

    bool operator == (const Scalar& rhs) const {
        if (is_num() && rhs.is_num()) return num() == rhs.num();
        if (is_num() && rhs.is_str()) return num() == rhs.to_num();
        if (is_str() && rhs.is_num()) return to_num() == rhs.num();
        if (is_str() && rhs.is_str()) return str() == rhs.str();
        return false;
    }

    template<class T> bool operator == (const T& rhs) const {
        return *this == Scalar(rhs);
    }

We can see that the comparison operator looks a lot like the summation operator. Again we have a case for every combination of subtypes of the lhs and rhs operands. One difference, though, is that instead of returning a Scalar we return a boolean value this time.

Then again we use a template to handle the non-Scalar cases. Inside this method we create a Scalar from the intrinsic number or string type and then use the Scalar-Scalar implementation to handle the rest.

I promised earlier to revisit the implementation of the Perl idiom of pretending that the Scalar is a boolean and just doing a simple if(). Here is one example (you can see more in the test script).

    Scalar $s = "0";
    if ($s) {
        cout << "$s is true" << endl;
    } else {
        cout << "$s is false" << endl;
    }

Will print "$s is false", as you may expect. One subtle thing for non-Perl programmers to notice is that the zero we are using is in fact a string. The logic is the following:

- "0" is interpreted as false
- 0 => false
- "0dsad" => false (conversion of the string to a number yields 0, i.e. false)
- "sadoa" => true , any regular string is treated as true
- "4567sdawq" => true, any string convertible to non-zero number is true.
- 89 => true, any number except zero is treated as true.
- "0E0" => true, special case zero but true
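The rules in this list can be tried out with a small standalone sketch. It reuses the conversion logic of the article's str2num() helper, written with a plain `std::pair` instead of the pBool macro; the two `is_true()` wrappers are names I made up for the illustration. The results assume that stream extraction stops at the first character that cannot be part of a number, which may vary between standard library implementations for strings such as "0dsad".

```cpp
#include <sstream>
#include <string>
#include <utility>

typedef double number;

// .first is the truth value of the string, .second its numeric value.
std::pair<bool, number> str2num(const std::string& str) {
    std::pair<bool, number> rv(false, 0);
    if (str == "0E0") {          // special case: zero but true
        rv.first = true;
        return rv;
    }
    std::istringstream is(str);
    is >> rv.second;
    // True when exactly one of these holds: the parse worked, the value is 0.
    // So "4567sdawq" (parsed, non-zero) and "sadoa" (not parsed) are true,
    // while "0" (parsed, zero) is false.
    if (!is.fail() != (rv.second == 0)) rv.first = true;
    return rv;
}

// Hypothetical wrappers: truth value of a string and of a number.
bool is_true(const std::string& s) { return str2num(s).first; }
bool is_true(number n) { return n != 0; }
```

Calling `is_true("0")` and `is_true(0)` gives false, while `is_true("sadoa")`, `is_true("4567sdawq")`, `is_true(89)` and `is_true("0E0")` give true.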

    #define pBool std::pair<bool,number>

    pBool str2num(const string& str) {
        istringstream is(str);
        pBool rv(false,0);
        if (str == "0E0") {//Zero but true
            rv = std::make_pair(true,0);
        } else {
            is >> rv.second;//convert
            //logical XOR : !fail != !0?
            if ( !is.fail() != (rv.second == 0) ) rv.first = true;
        };
        return rv;
    };

The utility function we use does not just convert from string to number; it also returns a boolean that tells the receiving end how the string should be treated in a boolean context. We do that by returning a pair of data. We can access both elements of the pair like this: str2num(str).first and str2num(str).second. The other weird thing you can see is the special handling of the string "0E0", which translated means "Zero but true", i.e. if we use the result of the conversion as a number it will be interpreted as

- stream operator (friend)
- increments : ++$i vs $i++

How is that ?

If you look at Scalar.h, you may see that

In fact when I first started implementing Scalar I wrongly started with the addition operation rather than with the shortcut, and as I was doing it I saw that I could describe them in micro-units of macros. If you glance over the code you will see there are three basic micro-operations involved:

Here is a glimpse of what it looked like:

    ....
    //shortcut for string+string and string+number logic
    #define NumNum(op,l,r) return scalar(l op r)
    #define StrStr(op,l,r) { mn_cn12(l,r) ; return scalar(logical_xor(n1,n2) ? n1 op n2 : (n1 && !n2 ? n1 op 0 : 0 op n2 ) ); }
    //Ex: convert lhs to num. If it can be converted then return result of num-operation, else convert rhs to string and...
    #define StrNum(op,l,r) { mn_cn1(l); return scalar(n1 ? n1 op r : 0 op r ); }
    #define NumStr(op,l,r) { mn_cn1(r); return scalar(n1 ? l op n1 : l op 0 ); }
    ......

Of course, using macros is always asking for trouble, but anyway it was a good exercise.

Templates are of no use here, because they handle "type variability". In this case the variable thing is the operation (+, -, *, /), and as we know the operation is the method name, not the method argument types.
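For completeness, there is a macro-free way to share the body between operations, though not their names: pass the operation in as a function object. This is not what Scalar.h does; it is just a hedged sketch with plain doubles, where `apply_op`, `add`, `sub`, `mul` and `divide` are hypothetical names.

```cpp
#include <functional>

// One shared body; the operation arrives as a function object.
template <class Op>
double apply_op(Op op, double lhs, double rhs) {
    return op(lhs, rhs);
}

// Each named entry point still has to be declared one by one,
// just like operator+, operator-, ... would be in a class.
inline double add(double l, double r)    { return apply_op(std::plus<double>(), l, r); }
inline double sub(double l, double r)    { return apply_op(std::minus<double>(), l, r); }
inline double mul(double l, double r)    { return apply_op(std::multiplies<double>(), l, r); }
inline double divide(double l, double r) { return apply_op(std::divides<double>(), l, r); }
```

So the point about method names still stands: the template removes the duplicated logic, but you cannot generate the operator names themselves this way.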

As far as I know, only functional and logic languages and Perl (via AUTOLOAD) allow you to play with the "functor" name. Strongly typed languages like C++ normally discourage such freedom, because it can end badly if not used carefully.

I have personally witnessed what a monstrous abuse Perl AUTOLOAD can become in the wrong hands ;). (I'm talking about the God object anti-pattern; look it up on Wikipedia.)

But in the hands of experienced programmers it could work miracles...

One more overcomplication/oversimplification we could do, if we wanted to ;), is to implement all the operations in one method and then call it from everywhere.

Simply use templates to describe all possible variations of types and call one all-in-one method with the operands plus their types as arguments. This way the receiving end gets all the required information, so that we can split the operation handling via if-then-elses and do the correct typecasting (that's because we have to pass all variables as void pointers).

Here is our example from my old test.cpp :

    const string INT_TYPEID = typeid(int).name();
    const string NUM_TYPEID = typeid(number).name();
    const string STR_TYPEID = typeid(string).name();
    ...
    void test(string op, string n1_type, void* n1, string n2_type, void* n2, number got, number expected, string msg) {
        string n1_str = n1_type == NUM_TYPEID ? num2str(*(number*) n1) : "'" + *(string*) n1 + "'";
        string n2_str = n2_type == NUM_TYPEID ? num2str(*(number*) n2) : "'" + *(string*) n2 + "'";
        string details = msg + " : " + n1_str + op + n2_str + " = " + num2str(expected);
        myis(got,expected,details);
    }

The idea seems simple, but the implementation is ugly.

Do you see how we pass n1 and n2 as void* and then cast them when needed? One thing you may find hard to figure out in

What this translates to is, first typecast to

I thought test_scalar.cpp was a better place to illustrate this approach than Scalar.h.

    > g++ test_scalar.cpp -ltap++ -o test
    > ./test
    .....

There is a little problem with libtap++ (it may have been fixed already), namely that the is() test does not print the specified message in some cases. You can resolve the problem by editing tap++.h:

    142c142,143
    < bool ret = ok(2 * fabs(left - right) / (fabs(left) + fabs(right)) < epsilon);
    ---
    > bool ret = 2 * fabs(left - right) / (fabs(left) + fabs(right)) < epsilon;
    > ok(ret, message);

+> Scalar.zip

And by the way, use an integer in for loops, not Scalar ...

I'm currently experimenting with implementing Hash; again my idea is for it to look and feel close to Perl behavior if possible.

Of course it won't be 100% possible because we are working within the framework of C++, still... I already have a working implementation, here hash.h; I just have to write an article like the one you just read and create some example script.

Next comes HoH, i.e. hash of hashes; again I have an early working variant of it... but it is even trickier to make it Perl-like.
