Features

Data Splitter is a Windows desktop application that extracts search items from a variety of input sources. The search items can be simple text strings or patterns of varying complexity. Currently supported input sources include:

The program is configured with "rules" that specify:

A rule set defines the search items, the actions to perform, and other parameters. Rule sets are also known as "solutions".

The program installation comes with a dozen or so sample solutions. Most of these are set up to output plain text and HTML files, so that prospective users can try out the program without having to configure a database.

The program's real power, though, comes from its ability to populate databases with the found items. Data Splitter uses ODBC (Open Database Connectivity) to interact with databases, so it can be used with any database product that supports ODBC, which in practice means most DBMSs.
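
The idea of writing found items to a database can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Data Splitter itself is implemented: the e-mail pattern, the found_items table, and the store_found_items function are hypothetical, and the standard-library sqlite3 module stands in for an ODBC connection so that the sketch runs without a driver. A DB-API package for ODBC, such as pyodbc, would supply the connection instead, and the CREATE TABLE syntax may need adjusting for a given DBMS.

```python
import re
import sqlite3

# Hypothetical search pattern: e-mail addresses in free text.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def store_found_items(conn, text):
    """Extract items from text, write one row per item, return the item count."""
    items = EMAIL.findall(text)
    cur = conn.cursor()
    # Hypothetical table name; created on first use.
    cur.execute("CREATE TABLE IF NOT EXISTS found_items (item TEXT)")
    cur.executemany("INSERT INTO found_items (item) VALUES (?)",
                    [(item,) for item in items])
    conn.commit()
    return len(items)


# sqlite3 is an assumption standing in for an ODBC data source.
conn = sqlite3.connect(":memory:")
count = store_found_items(conn, "Contact sales@example.com or support@example.org.")
rows = [r[0] for r in conn.execute("SELECT item FROM found_items ORDER BY rowid")]
```

After the call, rows holds one entry per address found in the input text, in the order they appeared.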

Solutions within solutions

Data Splitter can also perform multiple passes over patterns using multiple solutions. A solution can isolate patterns and then use another solution to break each pattern down further. This subtle but powerful feature enables:

Using solutions within solutions is also referred to as "recursion", or "recursive" capability.
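
A two-pass scan of this kind might look like the following Python sketch. It is a simplified analogy using regular expressions, not Data Splitter's own rule syntax: the RECORD and FIELDS patterns, the sample input, and the function names are all hypothetical. The outer function isolates whole records, and hands each one to an inner function that splits it into fields.

```python
import re

# Hypothetical outer "solution": isolate one record per line.
RECORD = re.compile(r"^Order .+$", re.MULTILINE)
# Hypothetical inner "solution": break a record into named fields.
FIELDS = re.compile(r"Order (?P<id>\d+): (?P<qty>\d+) x (?P<item>\w+)")


def inner_solution(record):
    """Second pass: split one isolated record into its fields."""
    m = FIELDS.search(record)
    return m.groupdict() if m else None


def outer_solution(text):
    """First pass: isolate records, then delegate each to the inner solution."""
    results = []
    for record in RECORD.findall(text):
        fields = inner_solution(record)
        if fields is not None:
            results.append(fields)
    return results


sample = "Header line\nOrder 17: 3 x widget\nOrder 18: 1 x gadget\nFooter"
parsed = outer_solution(sample)
```

Keeping the two passes separate means the inner pattern only has to describe a single record, which is usually simpler than one pattern covering the whole input.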

Feature summary

Feature: In detail:
File I/O: Standard Windows file input and output is supported.
HTTP input: A Hypertext Transfer Protocol module fetches web pages.
Clipboard I/O: Windows clipboard reading and writing is supported.
MAPI interface: A Messaging Application Programming Interface module displays folder lists, scans emails, has a new-emails-only mode, and supports Simple MAPI. This allows Data Splitter to parse MS Outlook emails.
ODBC interface: An Open Database Connectivity interface displays database structure, executes SQL statements and writes fields / rows to ODBC databases.
MS Word interface: Data Splitter can use Microsoft Word to convert .DOC files to text prior to scanning. Requires Microsoft Word 97 or later.
Recursive capability: Solutions can use other solutions, without limit, to perform multiple-pass data transformations.
GUI and command line versions: Program is available as Windows GUI and command window executables.
Database-driven command line program: The command line program can be database-driven, allowing complex multi-step input (URL) scans and SQL executions.
Quick-start menu: A quick-start button menu provides a short list of buttons for commonly-used menu items.
Timer: A timer options dialog allows the user to specify the start time for the next run and the interval between runs.
Trace mode: To assist with development and debugging of solutions, a trace mode can be toggled on and off. When trace mode is "on" an HTML file is produced showing the patterns recognized in the input.
ASCII + Unicode support: The program's character recognition and output actions support both 8-bit and 16-bit character sets.
Help: Data Splitter has program help describing all aspects of solution configuration. There is also a tutorial to help new developers get started.
Extension DLL interface: Data Splitter provides an action that calls a user-defined DLL (Windows dynamic link library). Advanced users can extend Data Splitter with this feature.
Development Tools: Data Splitter Development Tools is an SDK (software development kit) that enables advanced users to call the Data Splitter DLL from other applications.
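
To give a feel for what the trace mode entry in the summary above describes, here is a small Python sketch that produces HTML showing which parts of an input were recognized. The actual layout of Data Splitter's trace file is not documented here, so the trace_html function, the use of mark and pre tags, and the sample phone-number pattern are all assumptions made for illustration.

```python
import html
import re


def trace_html(text, pattern):
    """Return an HTML fragment with every match of pattern wrapped in <mark>."""
    parts = []
    last = 0
    for m in re.finditer(pattern, text):
        # Escape the unmatched text before this match, then the match itself.
        parts.append(html.escape(text[last:m.start()]))
        parts.append("<mark>" + html.escape(m.group(0)) + "</mark>")
        last = m.end()
    parts.append(html.escape(text[last:]))
    return "<pre>" + "".join(parts) + "</pre>"


# Hypothetical input and pattern (seven-digit phone numbers).
report = trace_html("Call 555-0100 or <555-0199>", r"\d{3}-\d{4}")
```

Escaping the input before wrapping matches keeps any markup characters in the scanned text from being interpreted as HTML in the report.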

Planned features

Planned enhancements include:

Next generation

A next-generation product will determine the user's needs through a question-and-answer session and an analysis of input samples provided by the user, and will then provide a solution. Design and details are TBD. This is where the fun really begins.