I’ve recently been faced with the prospect of supporting a number of VBScripts in a product written almost exclusively in C#. So rather than continue supporting VBScript, I wondered how difficult it would be to convert the VBScript to .NET.
So how to convert the code? The obvious answer is to either write a parser or use a commercial converter or parser language. Well, I couldn’t find anything (at least not cheaply) that would automatically convert the code. I also rejected the idea of writing a parser since I know from experience that the parser is easy; it’s implementing all the language rules that is tricky. Now that sounds like a good reason, but in reality I knew that the VBScripts I’d be supporting were all roughly the same and would only use a small subset of the available VBScript lexicon, so I didn’t really want to spend the time implementing a full-blown solution. What I needed was a fairly quick and simple answer. I decided that it was about time I learnt regular expressions, and it seemed to me that they would provide me with a conversion mechanism.
The next problem was which .NET language to choose. I spend most of my day using C# but business developers don’t tend to like that, which is again another excuse. The real reason for choosing VB.NET is that it is *very* forgiving and shares a number of functions and keywords with VBScript, so it should make the conversion easier. So the converter was going to go from VBScript to VB.NET and use regexes to do the donkey work.
So using the Regex component of .NET I set about producing a converter.
The tasks, including the pattern used for each:
NB. As you’ll see I’m a RegEx newbie so the patterns used tended to change as I found new ways of doing things, but hey they work.
- Ensure Option Explicit is removed. I don’t need it, so I’m dropping it from the VBScript: (?i)\s*Option Explicit
- VBScript Functions and Subs. Pesky devils. The problem here is that functions need an "As Object" return type, and all routines really should (in my case) have their arguments prefixed with ByRef to be compatible with the VBScript.
  Find those routines: (?i)(Function|Sub)\s*[a-z_][a-z_0-9]*\s*[(](([\w\d]*)|(\s*,\s*[\w\d]*))*[)]
  Remember the name of the routine: (?i)(?<=((Function|Sub)\s*))\w*
  Get the arguments: (\((([\w\d]*)|(,\s*[\w\d]*))*)|(,\s*(([\w\d]*)|(,\s*[\w\d]*))*)
- Calls to routines. So we’ve converted the routine declaration, but VBScript doesn’t (typically) use brackets when calling a routine and VB.NET requires them, so "lucky" we remembered the names of the routines in step 2.
  Find any call to a routine that isn’t already using brackets and isn’t simply the assignment of the function result: string.Format(@"(?i)(?<!(Function|Sub)\s*){0}(?!(\s*=)|(\s*\())", routineName)
- Adding namespaces. This one caught me out at first. It was fairly obvious that I needed to import "System", but it took a couple of scratched heads before I included "Microsoft.VisualBasic". Seems obvious now!
- The next problem was general differences in keywords and types, etc. The most common one was replacing "now()" with "DateTime.Now". This proved an interesting problem since a number of scripts contained code such as "now()-1", and converting that to "DateTime.Now-1" didn’t cut it since .NET can’t correctly cast the integer. So (for some reason) I chose to replace those offsets with DateSerial(0,0,x): (?i)(?<=DateTime\.Now\s*)(-|\+)\d
- The other common problem was VBScript variables called "Return", so they needed replacing. Again nothing fancy, I just guessed at a unique name (NB Return isn’t a valid exit statement in VBScript): replace (?i)\bReturn\b with "ReturnVarX "
- "Set" and "Let". So Microsoft has finally killed these off, who knew? 😉 The pattern: (?i)(?<=\s)set\s*
- All done!
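The routine-declaration and call-site steps above can be sketched roughly as follows. This is a Python approximation rather than the converter itself, which used the .NET Regex class. Python's `re` module does not support variable-width lookbehind, so named groups stand in for the lookbehinds in the patterns above. The function names `convert_declarations` and `add_call_brackets` and the sample script are hypothetical. The sketch assumes each routine header sits on its own line and only handles calls that are whole statements; a trailing comment on a call line would end up inside the brackets.

```python
import re

# Header of a routine on its own line, e.g. "Function Add(a, b)" or "Sub Main".
DECLARATION = re.compile(
    r"^(?P<indent>[ \t]*)(?P<kind>Function|Sub)[ \t]+(?P<name>[a-z_]\w*)"
    r"[ \t]*(?:\((?P<args>[^)\r\n]*)\))?[ \t]*(?=\r?$)",
    re.IGNORECASE | re.MULTILINE,
)


def convert_declarations(script):
    """Add ByRef and "As Object" to routine headers; also return the routine names."""
    names = []

    def rewrite(match):
        names.append(match.group("name"))
        args = [a.strip() for a in (match.group("args") or "").split(",") if a.strip()]
        # Leave an argument alone if the script already says ByVal or ByRef.
        args = [a if re.match(r"(?i)by(val|ref)\b", a) else "ByRef " + a for a in args]
        header = "{}{} {}({})".format(
            match.group("indent"), match.group("kind"), match.group("name"), ", ".join(args)
        )
        if match.group("kind").lower() == "function":
            header += " As Object"
        return header

    return DECLARATION.sub(rewrite, script), names


def add_call_brackets(script, names):
    """Wrap the arguments of bare statement calls such as: Report "done", 3"""
    for name in names:
        # Skip lines where the name is followed by "=" (result assignment)
        # or "(" (already bracketed).
        call = re.compile(
            r"^(?P<indent>[ \t]*)" + re.escape(name)
            + r"(?![ \t]*[=(])(?:[ \t]+(?P<args>[^\r\n]*?))?[ \t]*(?=\r?$)",
            re.IGNORECASE | re.MULTILINE,
        )
        script = call.sub(
            lambda m, n=name: "{}{}({})".format(m.group("indent"), n, m.group("args") or ""),
            script,
        )
    return script


# Hypothetical input, made up for illustration.
vbscript = '''Function Add(a, b)
    Add = a + b
End Function

Sub Report(msg)
    lastMessage = msg
End Sub

Report "total"
x = Add(1, 2)
'''
converted, routine_names = convert_declarations(vbscript)
print(add_call_brackets(converted, routine_names))
```

Collecting the names while rewriting the headers mirrors the "remember the name of the routine" step, so the call-site pass only touches routines the script itself defines.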
So there you go: as long as you’re not guaranteeing 100% VBScript conversion and, like me, have a finite (if large) number of scripts to convert, you can probably convert them all with the minimum of fuss and a few regular expressions.
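As an illustration of how few expressions the simpler keyword fixes need, here is a minimal Python sketch that chains them. It is an approximation under stated assumptions, not the original converter: the function name `convert_keywords` is hypothetical, and for date offsets it swaps in `DateTime.Now.AddDays(n)` instead of the DateSerial replacement described in the post, because the post does not show the exact replacement string. Only literal numeric offsets standing alone are handled (something like `now() - 1 * n` would be converted incorrectly), and text inside string literals is not protected.

```python
import re

IMPORTS = "Imports System\nImports Microsoft.VisualBasic\n\n"


def _add_days(match):
    # "-" keeps its sign; "+" is dropped because AddDays(1) already adds.
    sign = "-" if match.group(1) == "-" else ""
    return "DateTime.Now.AddDays({}{})".format(sign, match.group(2))


def convert_keywords(script):
    """Apply the simple keyword fixes, in order, to a VBScript source string."""
    i = re.IGNORECASE
    # Drop the Option Explicit line entirely.
    script = re.sub(r"^[ \t]*Option[ \t]+Explicit[ \t]*\r?\n?", "", script,
                    flags=i | re.MULTILINE)
    # now() with a literal day offset, e.g. "now() - 1" (assumption: AddDays).
    script = re.sub(r"\bnow[ \t]*\(\)[ \t]*([-+])[ \t]*(\d+(?:\.\d+)?)(?![\w.])",
                    _add_days, script, flags=i)
    # Any remaining now().
    script = re.sub(r"\bnow[ \t]*\(\)", "DateTime.Now", script, flags=i)
    # Variables called Return clash with the VB.NET keyword.
    script = re.sub(r"\bReturn\b", "ReturnVarX", script, flags=i)
    # Set/Let in front of an assignment target.
    script = re.sub(r"\b(?:Set|Let)[ \t]+(?=[a-z_][\w.]*[ \t]*=)", "", script, flags=i)
    # Prepend the two namespaces the converted code relies on.
    return IMPORTS + script


# Hypothetical input, made up for illustration.
vbscript = '''Option Explicit
Dim Return
Set fso = CreateObject("Scripting.FileSystemObject")
Return = now() - 1
stamp = Now()
'''
print(convert_keywords(vbscript))
```

The order matters: the offset rule has to run before the plain `now()` rule, otherwise the offset would be left dangling after `DateTime.Now`.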
My development consists of the following. First, 1) developing a large main system, which is now complete and produces a large amount of database output.
Then, secondly, 2) taking these outputs and examining them briefly to ensure that they don’t clash with the existing database. This part took serious effort with all the *fuss* that I had to make. Anyway, I did it in VBScript, but it wouldn’t run: it gives me an “out of memory” error resulting from excessive use of arrays. This made me decide, at first glance, to migrate the script to C.
Then, next, 3) moving on to C. At first, a few of the known “stubs” or sub-functions seemed to be working correctly, but after rebuilding them into a larger block of subs, it became unstable. This has to do with pointers and all that stuff, which again brings more limitations that don’t fit my script.
Then, at last, 4) I found your blog, where it’s far easier to use a RegEx to convert or search any kind of text for any kind of output. So I want to ask you: besides continuing to search for tools to migrate the script from one language to another, why not just use RegEx as you did to look for the known output I produced earlier? It consists of a large “SQL” text that is embedded in the program, which I’m going to analyze for what it outputs to the pre-existing database.
So I need to copy parts of your RegEx to try it out, but first I need to learn a bit about RegEx, as I haven’t used it before. Where can I start, and are there any compilers or tools for practicing this?
thanks.
I mainly used a couple of websites and also created a quick test app. However, I have heard a lot of good things about RegexBuddy. Your requirements seem to be about SQL scripts, though, and there are tools to do that too.
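For anyone wanting to practise, a quick test app of the kind mentioned need not be more than a few lines. The sketch below is in Python and is not the test app from the reply; the helper name `find_matches`, the sample text, and the SQL pattern are all made up for illustration. It reports which lines of a block of text match a pattern, which is enough to experiment with expressions before using them on real scripts.

```python
import re


def find_matches(pattern, text, flags=re.IGNORECASE):
    """Return (line_number, matched_text) for every match of pattern in text."""
    compiled = re.compile(pattern, flags)
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in compiled.finditer(line):
            results.append((number, match.group(0)))
    return results


# Hypothetical program text with SQL embedded in string literals.
sample = '''sql = "SELECT name FROM customers WHERE id = 1"
cmd.Execute "insert into orders (id) values (1)"
total = total + 1'''

# Look for statements that write to a table, and report the table touched.
writes = r"\b(?:insert\s+into|update|delete\s+from)\s+\w+"
for number, text in find_matches(writes, sample):
    print(number, text)
```

Changing the pattern and rerunning is the whole workflow; matching line by line keeps the reported line numbers easy to check against the source.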