Saturday, May 26, 2012

SCT - A custom language code translator

I have been hard at work on a translator, and at last it is complete. The translator is a very lightweight program that analyzes a code file and a dictionary in order to translate the code between user-specified languages. This means that every user on the planet can have their own custom programming language and still be able to convert it into C++.

SCT - The translator
Specific Code Translator, SCT for short, analyzes two files. One is called the langdefines file. The other is the code written in the language specified in the langdefines file. Here is how it works:

Every langdefines file has this syntax: old_new. So, if we wanted the keyword "void" to be written as "nothing" in a new language, the entry in the langdefines would be: void_nothing. Comments and newlines are not yet supported in the langdefines file, but support for both is coming in future updates. Whitespace is readable.

Then, the translator picks apart the langdefines and parses the two sides of each entry into two separate arrays, one for the old strings and one for the new strings. After this setup, translation happens.
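Here is a minimal sketch of what that setup step could look like. It is not SCT's actual source: the names LangDefines and parseLangDefines are made up for illustration, and it assumes that entries are separated by whitespace and that each entry is split at its first underscore.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the two parallel arrays (not taken from SCT itself).
struct LangDefines {
    std::vector<std::string> oldWords;  // left side of each old_new entry
    std::vector<std::string> newWords;  // right side of each old_new entry
};

// Assumption: entries are separated by whitespace and split at the first '_'.
LangDefines parseLangDefines(const std::string& text) {
    LangDefines defs;
    std::istringstream in(text);
    std::string entry;
    while (in >> entry) {
        const std::string::size_type split = entry.find('_');
        // Skip entries with no underscore or with an empty side.
        if (split == std::string::npos || split == 0 || split + 1 == entry.size()) {
            continue;
        }
        defs.oldWords.push_back(entry.substr(0, split));
        defs.newWords.push_back(entry.substr(split + 1));
    }
    return defs;
}
```

Under those assumptions, parseLangDefines("void_nothing int_number") would put "void" and "int" in the old array and "nothing" and "number" in the new array, at matching positions.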

The translator then reads the code file specified by the user, along with the name of the output file and a magichar. More about those two later, though.

The translator replaces each new string with its matching old string, thus converting the code back into the language on the left side of the langdefines. However, the replacement is very literal, which is why "Specific" is the first word of the program's name. What may seem like design flaws are actually features to enable maximum usage of the translator.
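The replacement step can be sketched roughly as follows. This is an illustration, not the real SCT code: replaceAll and translate are hypothetical names, and the sketch assumes the pairs are applied one after another in array order, so a later pair can also match text produced by an earlier one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return text;  // nothing sensible to search for
    }
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}

// Turn every new string back into its old string, pair by pair.
std::string translate(std::string code,
                      const std::vector<std::string>& oldWords,
                      const std::vector<std::string>& newWords) {
    for (std::size_t i = 0; i < oldWords.size() && i < newWords.size(); ++i) {
        code = replaceAll(code, newWords[i], oldWords[i]);
    }
    return code;
}
```

Because the match is purely textual, a pair such as void_nothing rewrites "nothing" wherever it appears, including inside longer words and string literals.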

The translator will replace EVERY matching string from the arrays, wherever it appears. To keep something like cout << "torch" from becoming cout << "t||ch" (which happens when "or" is defined as the new word for "||"), you have to use the sensitive strings feature. Sensitive strings in the code are not parsed, which keeps them safe. Output, comments, variable names, etc. should all go in sensitive strings. You use a magichar to mark a sensitive string. The sensitive string goes between two magichars, and the magichar itself is specified at run time, like so: cout << "$torch$"
Here torch is left untouched, and the "$"s do not appear in the translated code.
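A rough sketch of how sensitive strings could be handled is below. Again, this is an assumption about the approach rather than SCT's real implementation: translateWithSensitive and replaceAll are hypothetical names, and the sketch assumes that an unclosed sensitive string simply runs to the end of the file.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return text;
    }
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();
    }
    return text;
}

// Translate everything outside magichar pairs; copy everything inside them
// unchanged. The magichars themselves are dropped from the output.
std::string translateWithSensitive(const std::string& code, char magichar,
                                   const std::vector<std::string>& oldWords,
                                   const std::vector<std::string>& newWords) {
    std::string output;
    std::string segment;
    bool sensitive = false;  // true while between two magichars

    // Flush the current segment, translating it only if it is not sensitive.
    auto flush = [&]() {
        if (!sensitive) {
            for (std::size_t i = 0; i < oldWords.size() && i < newWords.size(); ++i) {
                segment = replaceAll(segment, newWords[i], oldWords[i]);
            }
        }
        output += segment;
        segment.clear();
    };

    for (char c : code) {
        if (c == magichar) {
            flush();
            sensitive = !sensitive;  // entering or leaving a sensitive string
        } else {
            segment += c;
        }
    }
    flush();  // whatever follows the last magichar
    return output;
}
```

With '$' as the magichar and "or" defined as the new word for "||", the line cout << "$torch$" << (a or b) would come out as cout << "torch" << (a || b).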

Advantages
You can use this program in a number of ways, including fixing a repeated spelling mistake throughout a document, creating a custom syntax for a programming language, or just showing off your new language to your friends. The way SCT works allows for "inline" coding, meaning that if you used it to add some pizazz to C++'s syntax, you could still write plain C++ in the same file whenever you wanted to.

Future Plans
There are a few more plans in store for the program. I want to make reverse translation easier than messing with the code directly and I want to improve the syntax of the langdefines file.

The necessary files are all on my GitHub. My custom langdefines is for a language I call NRSL. Some of the people I have asked really hated the syntax, so go ahead and judge for yourself.
