Import data

Importing a data file

queXC can import a DDI/Fixed width data file combination, or a CSV file.

Import DDI/Fixed width data

Select a DDI file and a fixed width data file to upload, and give the upload a description.

Import CSV Data

Select a CSV file to upload and give it a description. The CSV file must have a header row (a first row containing variable descriptions), and each row must have the same number of columns as the header row.
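The header and column-count requirements can be checked before uploading. The following is a minimal sketch in Python and is not part of queXC; the function name check_csv is hypothetical.

```python
import csv
import io

def check_csv(text):
    """Return a list of problems found in CSV text (empty list if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty: a header row is required"]
    width = len(rows[0])  # the header row defines the expected column count
    problems = []
    # Data rows are numbered from 2 because row 1 is the header row.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != width:
            problems.append(f"row {number}: {len(row)} columns, expected {width}")
    return problems
```

An empty list means every row has the same number of columns as the header row.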

Add data to existing (Fixed width)

Select a data file which has more records than the data currently in queXC. The data file must contain a unique identifying variable that queXC can use to determine if the record has already been inserted (for example, a case id variable). Select this variable from the list of available variables to import the new rows from the data file.
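The role of the unique identifying variable can be illustrated with a short Python sketch. This is an assumption about the general technique, not queXC's actual code; the function name new_records and the field name caseid are hypothetical.

```python
def new_records(existing_ids, incoming_rows, id_field):
    """Return the incoming rows whose identifier is not already stored."""
    seen = set(existing_ids)
    fresh = []
    for row in incoming_rows:
        key = row[id_field]
        if key not in seen:
            seen.add(key)  # also skips duplicates within the incoming file
            fresh.append(row)
    return fresh

# Hypothetical example: records 1 and 2 were imported previously.
rows = [
    {"caseid": "1", "q": "a"},
    {"caseid": "2", "q": "b"},
    {"caseid": "3", "q": "c"},
]
added = new_records(["1", "2"], rows, "caseid")
```

Only the row with caseid 3 would be treated as new in this example, which is why the identifying variable must be unique for each record.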

Add data to existing (CSV)

As for fixed width, except that a CSV file is uploaded.

Codes

Importing a code group

A code group is a set of codes which may be hierarchical. A code group CSV file must contain the following 4 fields (without a header row):

code,description,keywords,parent_code

Only keywords and parent_code may be blank.

The 'code' is the value that gets assigned in the data file when the code is selected.
The 'description' is what is visible to the operator when selecting a code.
The 'keywords' are not visible to the operator, but are used when queXC automatically tries to assign a code.
The 'parent_code' field contains the parent of a code if the code scheme is hierarchical.

A simple "Yes or No" code group looks like this (not hierarchical, no keywords):

1,"Yes",,
2,"No",,

A more complex hierarchical code group with keywords looks like this:

1,"Plant",,
2,"Animal",,
11,"Vegetable",,1
12,"Fruit",,1
13,"Other plant",,1
21,"Mammal",,2
22,"Reptile",,2
23,"Bird",,2
24,"Fish",,2
111,"Carrot",,11
112,"Pumpkin",,11
121,"Apricot",,12
122,"Orange",,12
123,"Apple",,12
131,"Bottle-brush",,13
211,"Human",,21
212,"Chimpanzee","Chimp",21
221,"Crocodile","Croc",22
231,"Sparrow","Common sparrow",23
232,"Seagull","Gull Sea-bird",23
241,"Flathead",,24
242,"Whiting",,24
243,"Snapper",,24
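A code group file in this four-field format can be checked before import. The Python sketch below is illustrative only and is not part of queXC; the function names load_code_group and path_to are hypothetical, and the check that each parent_code refers to a code in the same file is an assumption about what a valid hierarchy requires.

```python
import csv
import io

def load_code_group(text):
    """Parse code group CSV text into {code: (description, keywords, parent_code)}."""
    codes = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # ignore completely blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 fields, got {len(row)}: {row}")
        code, description, keywords, parent = (field.strip() for field in row)
        if not code or not description:
            raise ValueError(f"code and description may not be blank: {row}")
        codes[code] = (description, keywords, parent)
    # Assumption: every non-blank parent_code must refer to a code in this file.
    for code, (_, _, parent) in codes.items():
        if parent and parent not in codes:
            raise ValueError(f"code {code} has unknown parent {parent}")
    return codes

def path_to(codes, code):
    """Return the descriptions from the top-level code down to the given code."""
    path = []
    seen = set()  # guards against a cycle in parent_code values
    while code and code not in seen:
        seen.add(code)
        description, _, parent = codes[code]
        path.append(description)
        code = parent
    return path[::-1]
```

With the hierarchical example loaded, path_to(codes, "111") walks from "Carrot" up through "Vegetable" to "Plant" and returns the descriptions from the top level down.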

Import code keyword correspondence

Code keyword correspondence is a listing of keywords that relate to a code. Use this function to import a CSV file containing 2 columns, where the first column is the code value, and the second column is the keyword. If multiple keywords apply to a single code, they should appear on new lines in the CSV file.

Once a code keyword correspondence has been uploaded, it is possible to assign a field in the data file that should be automatically read by queXC and then coded using the keyword correspondence, where the keyword exactly matches the data in the data file.

An example of this system would be:

You have data from a questionnaire where you have asked the questions: "What industry do you work in?" and "Who is your employer?". You then want to use queXC to code to the ABS ANZSIC coding scheme (Australian and New Zealand Standard Industry Classification). If you have a list of employer names and their associated ANZSIC codes, these can be uploaded using this feature. Then, although an operator would typically classify responses to the question "What industry do you work in?", a keyword correspondence built from the list of employers can be used to automatically code a response.
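The exact-match lookup described above can be sketched in Python. This is an illustration rather than queXC's implementation: the function names are hypothetical, the employer names and codes are made up (they are not real ANZSIC codes), and matching is assumed to be plain string equality.

```python
import csv
import io

def load_correspondence(text):
    """Parse two-column CSV text (code, keyword) into {keyword: code}."""
    lookup = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) == 2:
            code, keyword = row
            lookup[keyword] = code
    return lookup

def auto_code(values, lookup):
    """Return a code for each value with an exact keyword match, else None."""
    return [lookup.get(value) for value in values]

# Made-up correspondence file: one keyword per line, codes may repeat.
correspondence = 'X100,"Acme Mining"\nX200,"Example Bank"\nX200,"Example Bank Ltd"\n'
lookup = load_correspondence(correspondence)
coded = auto_code(["Example Bank", "Unknown Pty"], lookup)
```

Values without an exact match are left uncoded (None here), so they would still need to be classified by an operator.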

Also, see "Assign keyword groups to columns (correspondence)" below.

Creating a code group

The easiest way to create a code group is to open the blank.csv code group from the doc/ subdirectory of the queXC installation, and modify it.

Selecting a blank code for code group

queXC can automatically assign a code to a data record when the data record is blank/empty. This saves an operator having to do it manually. Make sure to include a code such as "Blank" in your code group, and then use this tool to assign that code as the blank code.

Create a process from a code group

Once a code group is created, a process must be created to make it available to operators. Select the parent process of the code group if you wish for other processing to be done before coding commences (for example, spell checking).

Operator management

Add operators

queXC relies on the web server's underlying authentication methods to authenticate operators. Add operators' user names here so that queXC knows which operators are allowed to use the system.

Assign operators to data

Operators may be assigned to particular data files, and will only receive work on that data file once assigned.

Assign operators to processes

Select an operator, then the processes they are allowed to operate. The field "Allow queXC to guess code" means that when working on this process, queXC will try to guess the correct code given the data in the record.

Assign operators to supervisors (experts) of processes

Use this function to select which operators are supervisors, or "experts" in a particular process. If an operator is assigned as a supervisor, then the operator doing the classification work can refer the current item to the supervisor if they cannot make a decision themselves. The work will then appear for the supervisor.

Job management

Create work

Creating work is the process of selecting which columns in the data are to be processed. Only columns of type 'text' are made available for cleaning/coding. First select the column to process, then the process to apply. A table of operators who are assigned to the selected data file/process will appear. You have the following options:

1. Allow a specific number of any operators to operate the process
Enter how many individual operators to apply this process. Usually this will be: 1
If it is more than 1, queXC will require that number of different operators to each apply the process to the column of data.

2. Allow specific operator(s) to operate the process
Select the specific operators to apply the process
Only this/these operator(s) will be able to apply this process.

Reference column:

You then have the option of selecting a reference column. This is mainly useful when using the process "Create new code group existing (code other)". This process is used where a text field needs to be coded, but the coding is based on an existing code in the data that can be updated. This happens most often in a questionnaire where questions such as these appear:

Q1: How did you get to work today?
1 - By car
2 - By train
3 - Other

Q2: If you selected 3 - Other, please specify how you got to work today:
_________________________________________________________________

In this case, you may decide to use queXC to update the coding scheme in Q1 to reflect the responses given when respondents selected "3 - Other" and entered a response.

Therefore in this case, you would choose Q2 as the variable to code, select the process "Create new code group existing (code other)", and then choose Q1 as the reference column.

Multiple choice coding scheme:

If you wish to code a response to multiple columns/groups, select a code group here to generate the groups from. An example of coding to multiple groups is when you have a question such as:

Q3: What government services do you use, and how often do you use them?

Then you would have a coding scheme for "how often", such as:

1,Very often
2,Often
3,Rarely
4,Never

And a second coding scheme of government services, such as:

1,Welfare
2,Schools
3,Roads
4,Courts

This second coding scheme would be your grouping scheme, and would produce one column per code in your data file. Then it will be possible to select any, all or no "government services" and rate them on the first "how often" scale. It will also be possible for the coding/classifying operator to select the text of the response that applies to the particular "government service" that is being coded, and this will appear as a new column in the data file that aligns with the selected government service.

Comparison work:

If coding work has been created on the same column for 2 operators, it is possible to create "comparison work". Comparison work is a new job that compares the previous operators' classifications for the column, and automatically assigns the code "Identical" if they are the same. If they are not the same, the system will present the differences to a third operator, who can choose between the two responses, acting as a "third umpire".
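The comparison step can be pictured with a short Python sketch. It is an assumption about the general logic, not queXC's code; the function name compare_codings and the dictionary layout (row number mapped to assigned code) are hypothetical.

```python
def compare_codings(first, second):
    """Split rows into those coded identically and those needing a third operator.

    first and second map a row number to the code each operator assigned.
    """
    identical = {}
    differing = {}
    # Only rows coded by both operators can be compared.
    for row in sorted(first.keys() & second.keys()):
        if first[row] == second[row]:
            identical[row] = first[row]
        else:
            differing[row] = (first[row], second[row])
    return identical, differing

# Hypothetical codings of rows 0 to 2 by two operators.
identical, differing = compare_codings(
    {0: "11", 1: "12", 2: "21"},
    {0: "11", 1: "13", 2: "21"},
)
```

In this example only row 1 would be presented to the third operator, who would choose between codes 12 and 13.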

Assign relevant columns to codes

For each column, it may be useful for the operator to be able to see data from other columns in the data file. Use this tool to select the relevant columns to display for each column and process being applied.

Assign keyword groups to columns (correspondence)

This screen allows the administrator to select a column from the data file that a keyword group should apply to, based on existing work created. queXC will then scan those records and where there is a keyword match, will apply the given code to the work created, and mark it as complete.

See "Import code keyword correspondence" above for a description of when this would be useful.

Progress

Display progress

This shows how much work is left to be done. Deleting a record does not delete any of the work previously done, but stops future work from occurring. You can see the result of your changes on the 'create work' page.

List all work

This lists work that has been done and work that is assigned to be done in the near future. It is possible to delete a work unit. This is safe, as a new work unit will be created automatically and assigned to the next available operator. Deleting work is useful if an operator has not signed off the system correctly and the work needs to be assigned to someone else.

List work assigned but not complete

This lists work assigned to an operator that is not yet done. Assigned work will appear regularly as operators are working, but if they have not logged off correctly or you wish to return their work to the pool, you can use this to delete any records assigned to them.

Modification history

A list of all changes to the data file. Shows every instance where a new revision of the data was made, along with the data. You can break down the view by selecting a specific column and/or row. Note: Rows in queXC start at 0.

Performance

Operator performance

This lists the average time in seconds it takes for a particular operator to do a particular process. You may select which operators or processes to restrict the list by for more specific information.
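As a worked illustration of the figure reported here, the Python sketch below averages durations per operator and process. The function name average_seconds and the (operator, process, seconds) log layout are hypothetical and do not reflect how queXC stores its timings.

```python
from collections import defaultdict

def average_seconds(work_log):
    """Average duration per (operator, process) from (operator, process, seconds) tuples."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum of seconds, item count]
    for operator, process, seconds in work_log:
        entry = totals[(operator, process)]
        entry[0] += seconds
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical timings for two operators on one process.
log = [("alice", "coding", 10), ("alice", "coding", 20), ("bob", "coding", 30)]
averages = average_seconds(log)
```

Restricting the list to particular operators or processes corresponds to filtering the log before averaging.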

Export

Exporting data

Data may be exported in the following formats:

1. Fixed width
All the data in an ASCII file of fixed width
2. DDI
A DDI description of the fixed width data (XML)
3. PSPP with data inline
The data and description of the data in a file that can be loaded into PSPP (or SPSS)
4. PSPP
Just the data description without the data (can refer to the fixed width file)
5. CSV
A CSV dump of the file (does not include data description)
6. CSV with labels for codes
A CSV dump of the file (does not include data description, but replaces code values with code labels)
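The difference between the two CSV formats can be shown with a small Python sketch. It is illustrative only: the function name label_row and the layout of the labels dictionary are hypothetical, and leaving unmatched values unchanged is an assumption.

```python
def label_row(row, labels):
    """Replace code values with their labels; leave other values unchanged.

    labels maps a column name to a {code: label} dictionary.
    """
    return {
        column: labels.get(column, {}).get(value, value)
        for column, value in row.items()
    }

# Hypothetical labels for one coded column.
labels = {"q1": {"1": "Yes", "2": "No"}}
plain = {"caseid": "7", "q1": "2"}
labelled = label_row(plain, labels)
```

Here the plain CSV row would contain the code value 2 for q1, while the labelled row would contain "No"; the caseid column has no labels and is left as it is.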

Update data description

Use this feature to update or add question and response/code labels for each question in a data file. These labels will appear both while coding and in the data descriptions when exporting the data.

List data

Use this feature to see the current status of a column in the data file.

Export code keyword correspondence

Export a CSV file by selecting a code, and then a text field. You can then upload this correspondence file into the keyword correspondence system for coding subsequent data files.

Exporting code groups

Code groups may be exported as CSV files to share or backup.

queXC Administration Manual