Database Subsetting Tool
SourceForge.net Logo
java.net Member Button: 90px wide

 
Frequently Asked Questions

Why am I getting so much data back?             
A Each association will be traversed in both directions, unless there is a restriction defined. For example, if the subject table is employee, and the association from department table to employee table is enabled, you will not only get all departments associated with any subject employee, but also all employees associated with any of that departments.

If you get rows which you don't intend to get, then a restriction is missing. You can use the explain feature in such a case. The explain log shows the chain of associations for each exported row back to a subject row, where the exportation starts. I recommend to read the tutorial, which explains the matters.

Instead of adding all restrictions you need, starting from the unrestricted model, you can also go the opposit way: at first disable all associations (except dependencies) using "Edit->Disable all associations" menu item. Then remove restrictions until you get all the rows you are looking for.
 
It's easier that way to get a working extraction model, but it's harder to ensure that the model isn't too restrictive, i.e. that you will get all rows you intend to get.
 
From my experience the best way to define an extraction model is to use the "Closure Browser" (use "Closure..." button).
 
The closure browser lists all tables which are directly or indirectly associated with the subject table (where extraction starts) according to the restrictions, ordered by distance.
 
Check each table of this list from top to bottom. If you don't want to have rows from the table to be exported, select the table and disable the association which gets automatically selected then (table's context menu is also available here). The table disapears from the list as soon as it is no longer associated with subject. 

This way you easily get an extraction model which extracts rows only from tables you want to extract data from. 

Finally check all associations between the remaining tables.
 
What does $DISTANCE and $IS_SUBJECT mean?
A $DISTANCE and $IS_SUBJECT are special expressions used in restriction conditions in order to define restriction models based on topological properties of rows.

Expression Type
$DISTANCE INTEGER   Minimum length of all paths from the row back to any subject row.
The distance of subject rows is 0.
$BIRTHDAY INTEGER Synonym for $DISTANCE.
$IS_SUBJECT BOOLEAN Synonym for $DISTANCE=0.

 
Q How do I anonymize the extracted data?
A It's possible to anonymize the extracted data by defining filters (via "Edit" menu "Filter editor...", or table's context menu)

A filter is an assignment of a SQL expression to a table column, the column values will be replaced by the result of the expression when the export file is written.
 
For example:
 
In order to substitute the values of the column PERSON.SSN with the constant value '123', define the filter:
PERSON.SSN := '123'

 
The filter
PERSON.SSN := substr(SSN, 1, length(SSN) - 3) || '***'
replaces the last 3 characters of the SSN with '***'
 
The filter:
PERSON.SSN := (SELECT SSN FROM TEST_SSN ORDER BY RAND()
                             FETCH FIRST 1 ROWS ONLY)
would replace the SSN with a randomly chosen SSN from a table holding test SSNs (in IBM's DB2)
 
Q Why doesn't Jailer detect associations in MySQL databases?
A If you use MyISAM database engine, no association will be detected. MyISAM doesn't know the concept of foreign key relationships, even if they're defined in the DDL.
 
The typical alternative engine that supports foreign key relationships is InnoDB.
 
Q Is it possible to avoid the creation of JAILER_* tables in the database?
A Set the working-table scope to "local database" ("Data Export" dialog) to avoid the creation of that tables in your database. The tables will then be created in a local database (H2).
For DB2, Oracle and PostgreSQL, it is also possible to create the tables as temporary tables.
 
Q I'm getting the message "Table has no primary key". What's wrong?
A Jailer cannot handle tables without primary keys. Exporting data is not possible, if a table without a primary key is associated with the subject table. In this case, you should define a key of the table in the tool's data model (not in the database!) manually, using the data model editor. Note that a key must be unique and known (i.e. not NULL).

(On Oracle, however, rowid-pseudo columns can be used instead of primary keys)
 
Q The GUI is slow, what can I do?
A Due to the dynamic nature of the layouting algorithm, the table-association graph is permanently redrawn. It seems that this consumes too much cpu time on some systems. However, there is a parameter called "animation step time" which determines the minimum time (in ms) for one animation step. Increasing that parameter (View->Animation step time) will reduce the cpu utilization.