Integrating RegexMagic Using COM Automation

If you are a software developer, and some of your products support regular expressions, make your customers happy and provide tight integration with RegexMagic. That way, they can use RegexMagic to generate regular expressions to be used with your software, without having to manually copy and paste regexes from RegexMagic to your software.

RegexMagic provides a COM Automation interface, which you can easily import and call from any development tool or language that supports COM. The interface enables you to launch RegexMagic, receive the final regular expression, and even store the RegexMagic formula allowing the regular expression to be edited later with RegexMagic. It is a single instance interface, which means that each application has its own private instance of RegexMagic.

Take a look at PowerGREP and EditPad Pro to see how convenient they make it to create a regular expression with RegexMagic.

Demo Applications Implementing RegexMagic’s COM Automation

At http://download.jgsoft.com/magic/RegexMagic2Clients.zip you can download two sample applications that communicate with RegexMagic through its COM Automation interface. One is written in Delphi 2009, and the other is written in C# (Visual Studio 2010).

Importing RegexMagic’s Type Library

RegexMagic’s installer automatically registers RegexMagic’s automation interface with Windows. To automate RegexMagic via COM, you need to import its type library. It is stored in RegexMagic2.exe, which is installed under C:\Program Files\Just Great Software\RegexMagic2 by default.

In Delphi, select Component|Import Component from the menu. Select “Import a Type Library” and click Next. Select “RegexMagic API” version “2.0” from RegexMagic2.exe and click Next. Choose a palette page and a directory, make sure Generate Component Wrappers is ticked, and click Next. Install into a new or existing package as you prefer. Two new components called TRegexMagicIntf and TRegexMagicIntf2 appear on the component palette. These components implement the methods and events you can use to communicate with RegexMagic.

Drop TRegexMagicIntf2 on a form or data module. Set ConnectKind to ckNewInstance, and make sure AutoConnect is False. This ensures your application has its own instance of RegexMagic, and that RegexMagic only appears when the user actually wants to edit a regular expression. Call the component’s Connect() method the first time the user wants to edit a regular expression with RegexMagic. For efficiency, do not call the Disconnect() method until your application terminates (at which point it is called automatically). Assign an event handler to OnFinishRegex.

Calling TRegexMagicIntf2.Connect() raises an exception if the user has an older version of RegexMagic. If you want to support older versions, drop a TRegexMagicIntf on your form as well, and call TRegexMagicIntf.Connect() when TRegexMagicIntf2.Connect() fails. Once a call to Connect() succeeds, your application should not make any other calls to Connect(), so that only one of the two components is connected. Otherwise, it will launch multiple instances of RegexMagic. Doing so would not cause any errors, but it does waste resources.

In Visual Studio, right-click on “References” in the Solution Explorer, and pick “Add Reference”. Switch to the COM tab, and choose “RegexMagic API” version “2.0” from RegexMagic2.exe. After adding the reference, import the RegexMagic namespace with “using RegexMagic”. Then you can easily access the RegexMagicIntf2 class. Create a new object from this class the first time the user wants to edit a regular expression with RegexMagic. Use the same object for all subsequent times the user wants to edit the same or another regex. Do not delete the object until your application terminates. Each time you create an object of RegexMagicIntf2, a new RegexMagic instance is launched. You also need to create an instance of IRegexMagicIntfEvents_FinishRegexEventHandler and assign it to the FinishRegex event of your RegexMagicIntf2 object.

Note that to successfully communicate with RegexMagic, two-way communication is required. Your application not only needs to call the methods of the RegexMagicIntf2 interface to send the regular expression to RegexMagic. It also needs to implement an event sink for the FinishRegex method defined in the IRegexMagicIntfEvents interface. Not all development and scripting tools that can call COM automation objects can also implement event sinks. The tool must support “early binding”.

RegexMagicIntf Interface

The RegexMagicIntf COM automation interface provides the following methods:

void IndicateApp(BSTR AppName, uint Wnd)

Call IndicateApp right after connecting to the COM interface. AppName will be displayed in RegexMagic’s caption bar, to make it clear that this instance is connected to your application. Wnd is the handle of your application’s top-level form where the regex is used. RegexMagic will call SetForegroundWindow(Wnd) to activate your application when the user is done with RegexMagic (i.e. after a call to FinishRegex). You can call IndicateApp as often as you want, as long as you also call it right after connecting to the COM interface.

uint GetWindowHandle()

Returns the window handle of RegexMagic’s top window, which is either the main window, or the topmost modal dialog box in RegexMagic. Pass this to SetForegroundWindow() after calling InitFormula. SetForegroundWindow() only works when called by the thread that has input focus, so RegexMagic cannot bring itself to the front. That is your application’s duty. (Likewise, RegexMagic will bring your application to the front after calling FinishRegex, using the window handle you passed on in IndicateApp().)

void SetOptions(BSTR StringStyle, BSTR RegexFlavor, BSTR ReplaceFlavor, VARIANT Options)

Do not call SetOptions when using the RegexMagicIntf2 interface as that will make RegexMagic 2 fall back to RegexMagic 1 support. Call SetOptions2 instead.

RegexMagic 1 requires you to call SetOptions before calling InitFormula the first time. After that, you only need to call SetOptions if your application wants to use different options.

StringStyle expects a string style identifier. It tells RegexMagic which string format it should use when returning the regular expression via FinishRegex. By using StringStyle, you can let RegexMagic take care of adding and removing quote characters, escaping characters, etc. If you pass an unsupported string type, “as is” will be used instead.

RegexFlavor and ReplaceFlavor tell RegexMagic which regular expression flavor and which replacement text flavor RegexMagic should use to generate the regular expression. The user will not be able to select different flavors.

Options expects an array of 4 integer values that indicate which options your application can set outside of the regular expression. If you pass NULL as the Options parameter, RegexMagic assumes your application supports all options, as if you had specified 1 for each value.

Index | Type | Option
0 | int | Dot matches line breaks
1 | int | Case insensitive
2 | int | ^ and $ match at line breaks
3 | int | Free-spacing syntax

You can specify 3 values for each option:

Value | Meaning
0 | The option is always turned off in the application.
1 | The option can be turned on or off in the application. RegexMagic will tell your application whether it wants to turn the option on or off via the Options parameter of FinishRegex.
2 | The option is always turned on in the application.

If you set an option to 0 or 2, and the string style does not include regex flags (e.g. /ismx in Perl), RegexMagic uses mode modifiers instead, if the chosen regex flavor supports them. If the flavor does not support mode modifiers, and the string style does not support regex flags, then the options your application cannot set will be unavailable. That prevents the user from generating regular expressions that rely on those options being on or off.
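The Options array described above can be assembled with a small helper. The following is a minimal Python sketch, not part of the RegexMagic API: the names OptionSupport, OPTION_ORDER, and build_options are hypothetical, and how the resulting list is marshalled into a VARIANT array depends on the COM bridge you use, which is not shown here.

```python
from enum import IntEnum


class OptionSupport(IntEnum):
    """The three values the text allows for each entry of Options."""
    ALWAYS_OFF = 0   # the application can never turn the option on
    SELECTABLE = 1   # the application can turn the option on or off
    ALWAYS_ON = 2    # the application always has the option on


# Index order taken from the Index/Type/Option table.
# The Python names themselves are hypothetical.
OPTION_ORDER = (
    "dot_matches_line_breaks",
    "case_insensitive",
    "anchors_match_at_line_breaks",
    "free_spacing",
)


def build_options(**support):
    """Return the 4-element list for the Options parameter.

    Options that are not named default to SELECTABLE (1), which mirrors
    what the text says SetOptions assumes when Options is NULL.
    """
    unknown = set(support) - set(OPTION_ORDER)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    # OptionSupport(...) raises ValueError for anything other than 0, 1, 2.
    return [
        int(OptionSupport(support.get(name, OptionSupport.SELECTABLE)))
        for name in OPTION_ORDER
    ]
```

For example, an application that is always case insensitive and never supports free-spacing syntax could pass build_options(case_insensitive=2, free_spacing=0).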

void InitFormula(BSTR Formula)

Call InitFormula to make RegexMagic show up with the given RegexMagic formula. RegexMagic will follow up with a call to FinishRegex when the user closes RegexMagic by clicking the Send To button.

Formula is the RegexMagic formula (Samples, Match, and Action panel settings) to be used as the basis for creating the regular expression. Pass NULL or an empty string when you call InitFormula the first time to start with a blank formula. Pass the formula returned by FinishRegex if the user wants to continue editing the same regular expression. If you pass NULL upon the second call, RegexMagic shows up with the same formula the user edited last time, provided RegexMagic has not been shut down between calls. You should treat Formula as an opaque string that your application can use to restore RegexMagic’s state. You can persist this string with your application’s settings. If your application uses regular expressions in multiple areas, you can persist multiple formulas and make RegexMagic switch between them depending on the area the user is working in.
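Persisting one formula per area of an application could look like the following Python sketch. The FormulaStore class, the JSON file format, and the area names are assumptions made for illustration; the only thing taken from the text is that a formula is an opaque string and that an empty string starts a blank formula.

```python
import json
from pathlib import Path


class FormulaStore:
    """Keeps one opaque RegexMagic formula per application area in a JSON file."""

    def __init__(self, path):
        self._path = Path(path)
        self._formulas = {}
        if self._path.exists():
            self._formulas = json.loads(self._path.read_text(encoding="utf-8"))

    def formula_for(self, area):
        """Formula to pass to InitFormula; an empty string starts a blank formula."""
        return self._formulas.get(area, "")

    def remember(self, area, formula):
        """Store the formula received via FinishRegex without interpreting it."""
        self._formulas[area] = formula
        self._path.write_text(json.dumps(self._formulas), encoding="utf-8")
```

The store never parses the formula, in line with the advice to treat it as an opaque string.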

Since a RegexMagic formula includes samples, passing a formula to InitFormula replaces any samples you may have added with AddSampleString and AddSampleFile. If you want RegexMagic to edit an existing formula using new samples, call ClearSamples, AddSampleString, and/or AddSampleFile immediately after calling InitFormula, instead of before.

void ClearSamples()

Call ClearSamples to clear the Samples panel in RegexMagic. This clears out all samples set by previous calls to AddSampleString and AddSampleFile.

void AddSampleString(BSTR Caption, BSTR Sample)

Call AddSampleString to add (a sample of) the data your application will use the regular expression on. RegexMagic adds it to the Samples panel. That way the user can instantly test the regex on the data it’ll actually be used on. You should only use AddSampleString if the data is not stored as a file on disk. Otherwise, AddSampleFile is more efficient. You can call AddSampleString multiple times to provide multiple samples. Use the Caption parameter to give each sample a different caption in the list on the Samples panel.

void AddSampleFile(BSTR Caption, BSTR Filename)

Call AddSampleFile to make RegexMagic load one of the files your application will use the regular expression on. RegexMagic adds it to the Samples panel. That way the user can instantly test the regex on the data it’ll actually be used on. You can call AddSampleFile multiple times to provide multiple samples. Use the Caption parameter to give each sample a different caption in the list on the Samples panel. If you don’t specify a caption, the file’s name is used.

RegexMagicIntf2 Interface

RegexMagic 2.0.0 and later provide an additional interface called RegexMagicIntf2. This interface descends from RegexMagicIntf, so all the above methods are also supported by RegexMagicIntf2. You should use this interface to make use of the new functionality in RegexMagic 2. If your application is unable to instantiate this interface, it can fall back on RegexMagicIntf if you also want to support RegexMagic 1.

BOOL CheckVersion(uint Version)

Call CheckVersion() before calling SetOptions2 with an application identifier that is not supported by RegexMagic 2.0.0. RegexMagic 2.0.0 returns TRUE if you pass 200 for Version and FALSE for all other versions. Since RegexMagic 2.0.0 is the first version to support this call, you cannot query for 1.x.x support. RegexMagic 2.0.0 supports all regex flavors supported by 1.x.x.

Future versions of RegexMagic will continue to return TRUE if you pass 200 for Version. They will return TRUE for additional version numbers if (and only if) new applications or regular expression flavors are added.
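The decision of when to call CheckVersion can be sketched in Python as follows. The encoding of a version as major*100 + minor*10 + patch is an assumption inferred from the only two examples this document gives (200 for 2.0.0 here, and CheckVersion(210) for 2.1.0 in the description of SetOptions2). The function names are hypothetical, and check_version stands for any callable that wraps the real COM method.

```python
def version_code(version):
    """Convert 'major.minor.patch' to the integer passed to CheckVersion.

    Assumed encoding: major*100 + minor*10 + patch, inferred from the
    examples 2.0.0 -> 200 and 2.1.0 -> 210.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if not (0 <= minor <= 9 and 0 <= patch <= 9):
        raise ValueError("minor and patch must be single digits in this encoding")
    return major * 100 + minor * 10 + patch


def needs_check_version(introduced_in):
    """True if an identifier introduced in this version requires CheckVersion."""
    # Identifiers from 2.0.0 or 1.x.x never need the check.
    return version_code(introduced_in) > 200


def can_use_identifier(introduced_in, check_version):
    """Decide whether an identifier may be passed to SetOptions2.

    check_version is a callable wrapping the COM CheckVersion method.
    """
    if not needs_check_version(introduced_in):
        return True
    return bool(check_version(version_code(introduced_in)))
```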

void SetOptions2(BSTR StringStyle, VARIANT Application, VARIANT Options)

Call SetOptions2() before calling InitFormula the first time. After that, you only need to call SetOptions2 if your application wants to use different options.

StringStyle expects a string style identifier. It tells RegexMagic which string format it should use when returning the regular expression via FinishRegex. By using StringStyle, you can let RegexMagic take care of adding and removing quote characters, escaping characters, etc. If you pass an unsupported string type, “as is” will be used instead.

Application can be set to a string with an application identifier. If the identifier was introduced in a version of RegexMagic later than 2.0.0, you can only use it if CheckVersion() returned TRUE for that version. E.g. if the identifier was introduced in RegexMagic 2.1.0, you can only use it if CheckVersion(210) returns TRUE. There is no need to call CheckVersion() for identifiers introduced in RegexMagic 2.0.0 or 1.x.x.

Alternatively, Application can be set to a variant array that defines a custom application.

Index | Type | Description
0 | BSTR | Name the flavor is indicated with in flavor selection lists. This is only used if the custom application is unknown to RegexMagic. If the user previously added the same application under a different name, then the user’s chosen name will be preserved.
1 | BSTR | Regular expression flavor identifier
2 | BSTR or NULL | Replacement flavor identifier. You can set this to the empty string or NULL to disable Replace mode if the application cannot search-and-replace using a regex.
3 | BSTR or NULL | Split flavor identifier. You can set this to the empty string or NULL to disable Split mode if the application cannot split strings using a regex.
4 | BSTR or NULL | String style identifier. Used only for the Copy and Paste menus in RegexMagic. Not used to pass regexes via the COM interface. Defaults to “asis” if omitted.
5 | BSTR or NULL | File name without extension of the RegexMagic source code template for generating source code snippets on the Use panel. This can be a built-in or a custom template. The Use panel is disabled if you don’t specify a template file name or if the file cannot be found.
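The six-element array for a custom application could be assembled as in this Python sketch. The function custom_application is hypothetical, as are the placeholder identifiers in the usage note below; real flavor identifiers must come from RegexMagic’s documentation. Using Python’s None to stand for a NULL variant is also an assumption about the COM bridge in use.

```python
def custom_application(name, regex_flavor, replace_flavor=None,
                       split_flavor=None, string_style=None, template=None):
    """Return the 6-element list that defines a custom application.

    Element order follows the Index column of the table:
    0 name, 1 regex flavor, 2 replacement flavor, 3 split flavor,
    4 string style, 5 source code template file name.
    None is used where the table allows NULL.
    """
    if not name or not regex_flavor:
        raise ValueError("name and regex_flavor are required")
    return [name, regex_flavor, replace_flavor, split_flavor,
            string_style, template]
```

For instance, custom_application("My App", "example-flavor") describes an application that can only search, so Replace mode, Split mode, and the Use panel would be disabled according to the table. Both strings are placeholders.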

Options expects an array of 4 integer values that indicate which options your application can set outside of the regular expression. If you pass NULL as the Options parameter, RegexMagic assumes your application supports the same options as the regular expression flavor specified by the Application parameter (whether explicitly as a custom application or implicitly with an application identifier).

Index | Type | Option
0 | int | Dot matches line breaks
1 | int | Case insensitive
2 | int | ^ and $ match at line breaks
3 | int | Free-spacing syntax

You can specify 3 values for each option:

Value | Meaning
0 | The option is always turned off in the application.
1 | The option can be turned on or off in the application. RegexMagic will tell your application whether it wants to turn the option on or off via the Options parameter of FinishRegex.
2 | The option is always turned on in the application.

If you set an option to 0 or 2, and the string style does not include regex flags (e.g. /ismx in Perl), RegexMagic uses mode modifiers instead, if the chosen regex flavor supports them. If the flavor does not support mode modifiers, and the string style does not support regex flags, then the options your application cannot set will be unavailable. That prevents the user from generating regular expressions that rely on those options being on or off.

void AddSampleString2(BSTR Caption, BSTR Sample, VARIANT Options)

AddSampleString2 does the same as AddSampleString but takes a third parameter that allows you to control how RegexMagic interprets the sample string. You should pass an array with 4 elements. If you don’t need these options, then you can use AddSampleString instead.

Index | Type | Description
0 | BSTR | String style identifier that indicates how RegexMagic should interpret the Sample string. Defaults to asis if you do not specify a valid string style identifier.
1 | BSTR | If your application converted the string from a particular encoding to the Unicode BSTR needed for the Sample argument, then you can specify one of the text file encoding identifiers to have RegexMagic convert the string back to that encoding so the user can work with the same encoding on the Samples panel in RegexMagic as in your application. The default is utf16le which is the native encoding of the COM interface.
2 | BSTR | Specify file, page, or line to set the scope to “whole file”, “page by page”, or “line by line”. Pass NULL or an empty string to leave the scope unchanged.
3 | BSTR | You should set this to mixed to load the string without changing its line breaks. Other supported line break options are auto, crlf, lf, and cr.
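The four-element Options array for AddSampleString2 could be built and checked as in this Python sketch. The helper name sample_string_options is hypothetical. The defaults follow the table (asis, utf16le, scope left unchanged, mixed line breaks), and None is assumed to be how the COM bridge represents NULL.

```python
SCOPES = {"file", "page", "line"}
LINE_BREAKS = {"mixed", "auto", "crlf", "lf", "cr"}


def sample_string_options(string_style="asis", encoding="utf16le",
                          scope=None, line_breaks="mixed"):
    """Return the 4-element list for the Options parameter of AddSampleString2.

    scope may be None or "" to leave RegexMagic's current scope unchanged.
    String style and encoding identifiers are passed through unchecked,
    because their full lists are not part of this section.
    """
    if scope not in (None, "") and scope not in SCOPES:
        raise ValueError(f"scope must be one of {sorted(SCOPES)} or empty")
    if line_breaks not in LINE_BREAKS:
        raise ValueError(f"line_breaks must be one of {sorted(LINE_BREAKS)}")
    return [string_style, encoding, scope, line_breaks]
```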

void AddSampleFile2(BSTR Caption, BSTR Filename, VARIANT Options)

AddSampleFile2 does the same as AddSampleFile but takes a third parameter that allows you to control how RegexMagic reads the sample file. You should pass an array with 4 elements. If you don’t need these options, then you can use AddSampleFile instead.

Index | Type | Description
0 | BSTR | Specify one of the text file encoding identifiers to interpret the file as a text file with the given encoding. Specify bytes to load the file as a binary file in hexadecimal mode. Pass NULL or an empty string to have RegexMagic auto-detect the encoding.
1 | BSTR | Specify file, page, or line to set the scope to “whole file”, “page by page”, or “line by line”. Pass NULL or an empty string to leave the scope unchanged.
2 | BSTR | Specify auto to convert the line breaks in the file to the style expected by the regex flavor. Specify mixed to load the file without converting its line breaks. Specify crlf, lf, or cr to convert the line breaks to the specified style. The conversion is only applied to the copy of the file in memory. The file on disk is not changed. Pass NULL or an empty string to leave the line break mode unchanged.

RegexMagicIntfEvents Interface

FinishRegex(BSTR Regex, BSTR Replacement, VARIANT Options, BSTR Formula)

You must provide an event handler for this event to receive the regular expression generated by RegexMagic. After you call InitFormula, RegexMagic will call FinishRegex when the user closes RegexMagic by clicking the Send To button. The call to InitFormula returns immediately. The regular expression and replacement text returned by FinishRegex use the flavors and options you passed to the most recent call to SetOptions2 or SetOptions.

RegexMagic passes an array of values to Options to indicate which options your application should turn on (TRUE) or turn off (FALSE). If you called SetOptions2() then the array will have 12 elements. If you called SetOptions(), then the array will have only 4 elements (indexes 0 to 3 in the table). The number of elements does not depend on the parameters you passed to SetOptions2() or SetOptions(). It only depends on which of these two methods you called. Options not supported by your application are still included in the array.

Index | Type | Option
0 | BOOL | Dot matches line breaks (TRUE) or dot doesn’t match line breaks (FALSE).
1 | BOOL | Case insensitive (TRUE) or case sensitive (FALSE).
2 | BOOL | ^ and $ match at line breaks (TRUE) or ^ and $ don’t match at line breaks (FALSE).
3 | BOOL | Free-spacing (TRUE) or exact spacing (FALSE).
4 | BSTR | Line break style. One of: default, lf, cr, crlfonly, crlf, unicode
5 | BOOL | Named capture only (TRUE) or numbered capture (FALSE).
6 | BOOL | Allow duplicate names (TRUE) or require names to be unique (FALSE).
7 | BOOL | Lazy quantifiers (TRUE) or greedy quantifiers (FALSE).
8 | BOOL | Skip zero-length matches (TRUE) or allow zero-length matches (FALSE).
9 | int or NULL | Limit for “split” actions. Set to a NULL variant to indicate that no limit is set at all (which is not the same as setting the limit to zero).
10 | BOOL | Split: add groups (TRUE) or don’t add groups (FALSE).
11 | BOOL | Split: add empty strings (TRUE) or don’t add empty strings (FALSE).
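An event handler may find it convenient to turn the Options array into named values. The Python sketch below does that for both the 4-element and the 12-element form. The key names and the function decode_finish_options are hypothetical; only the index order comes from the table. It assumes the COM bridge delivers the array as an ordinary Python sequence, with None standing for a NULL variant.

```python
# Names for indexes 0..11, in table order. The names are hypothetical.
FINISH_OPTION_NAMES = (
    "dot_matches_line_breaks",
    "case_insensitive",
    "anchors_match_at_line_breaks",
    "free_spacing",
    "line_break_style",
    "named_capture_only",
    "allow_duplicate_names",
    "lazy_quantifiers",
    "skip_zero_length_matches",
    "split_limit",
    "split_add_groups",
    "split_add_empty_strings",
)


def decode_finish_options(options):
    """Map the Options array received in FinishRegex to a dict.

    Accepts 4 elements (after SetOptions) or 12 elements (after SetOptions2).
    """
    values = list(options)
    if len(values) not in (4, 12):
        raise ValueError(f"expected 4 or 12 options, got {len(values)}")
    # zip stops at the shorter input, so 4 values yield only indexes 0..3.
    return dict(zip(FINISH_OPTION_NAMES, values))
```

A split limit of None in the result means no limit is set, which the table distinguishes from a limit of zero.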

RegexMagic also returns the formula that was used to generate the regular expression. Your application can save this formula and pass it to the next call to InitFormula if the user wants to edit the same regular expression.

If the user closes RegexMagic without clicking the Send To button, your application will not receive a call to FinishRegex.

Which Methods to Call and Events to Handle

In summary, you must call IndicateApp immediately after connecting to RegexMagic. Then call SetOptions2 for RegexMagic 2 or SetOptions for RegexMagic 1 to indicate your application’s capabilities. Then, call InitFormula each time the user wants to create a regular expression with RegexMagic. RegexMagic follows up with a corresponding call to FinishRegex. If you call InitFormula again before receiving a call to FinishRegex, the effects of the previous call to InitFormula are canceled.
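The call order in this summary can be sketched in Python with a stand-in object in place of the real COM connection. Everything here is illustrative: RecordingStub only records method names and never talks to RegexMagic, RegexMagicSession is a hypothetical wrapper, and "example-app" is a placeholder rather than a real application identifier. Wiring on_finish_regex to the actual FinishRegex event depends on the COM bridge and is not shown.

```python
class RecordingStub:
    """Stand-in for the COM object; records method names instead of calling RegexMagic."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the COM methods.
        def method(*args):
            self.calls.append(name)
        return method


class RegexMagicSession:
    """Applies the order: IndicateApp, SetOptions2, then InitFormula per edit."""

    def __init__(self, com_object, app_name, window_handle,
                 string_style, application):
        self._com = com_object
        self.pending = False
        self.result = None
        # IndicateApp comes first, immediately after connecting.
        self._com.IndicateApp(app_name, window_handle)
        # Passing None for Options is assumed to map to NULL.
        self._com.SetOptions2(string_style, application, None)

    def edit(self, formula=""):
        """Ask RegexMagic to show up; a repeated call cancels the earlier one."""
        self.pending = True
        self._com.InitFormula(formula)

    def on_finish_regex(self, regex, replacement, options, formula):
        """To be invoked by the FinishRegex event handler."""
        self.pending = False
        self.result = (regex, replacement, formula)


stub = RecordingStub()
session = RegexMagicSession(stub, "My App", 0, "asis", "example-app")
session.edit()
print(stub.calls)
```

If the user closes RegexMagic without clicking Send To, on_finish_regex is never invoked and pending stays True until the next call to edit.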