Preparing your data for ORDS

About this guide

This document is one of a series of HOW-TO guides for the Online Research Database Service (ORDS). It will tell you how to prepare an existing dataset for importing into ORDS.

Preparing your data for ORDS

Before you can upload your data to ORDS, it is important to ensure that it is in the right format. Two types of file can currently be imported: Microsoft Access files (for relational databases and other tabular data), and XML files. (There are plans for future versions of ORDS to offer the ability to upload other file formats.)

If you have data tables which are stored in another file format (such as Microsoft Excel, another spreadsheet package, or a statistical analysis package), you will need to convert these into Microsoft Access files before they can be uploaded to ORDS. Instructions for doing this are given in sections Preparing Microsoft Excel files for ORDS and Preparing other tabular file formats for ORDS below.

If you do not have your own copy of Microsoft Access, the software is available on the computers in the OUCS Help Centre. There may also be computers in your college or department which have the software. Once you have uploaded your data to ORDS, you will be able to access and edit your data through the ORDS online interface; it is not necessary to continue using Access.

Preparing Microsoft Access files for ORDS

The process of importing Access files into ORDS is straightforward; nevertheless, there are a few things you need to be aware of before you do this.

File names, table names, and field names

The software used by ORDS imposes tighter restrictions on the names of database files and the objects within databases than Access does. This means you may need to edit these names to ensure your data is compatible with ORDS. In particular:

  • File names, table names, and field (column) names should consist only of letters, numbers, and underscores (_), without any other special characters. (If special characters are included, in some cases this will prevent the database tables from being generated; in others, ORDS will generate the tables, but will strip out the special characters that would otherwise cause a problem.)
  • It is also good practice to avoid spaces in these names. Underscores can be used as an alternative way of separating words.
  • Field and table names must start with a letter. If you have numerical names, you will need to add one or more letters to the beginning of them – for example, the field name 2012 might become Year_2012, or just Y2012.
  • Field names in ORDS will be displayed with an initial capital letter, but other capitalization will not be retained. This means that ‘camel case’ cannot be used in ORDS field names: for example, AuthorAge would display as Authorage. Once again, underscores can be used as an alternative here.

These restrictions apply only to the names of files, tables, and fields, not to the data contained within database records.
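
If you have many names to check, the rules above can be applied with a short script before you edit the database. The following Python sketch is illustrative only and is not part of ORDS: the function name ords_safe_name and its prefix parameter are hypothetical, and it assumes that 'letters' means the unaccented letters A to Z.

```python
import re


def ords_safe_name(name: str, prefix: str = "F") -> str:
    """Return a version of `name` using only letters, numbers, and underscores."""
    # Insert an underscore where a lower-case letter or digit meets a capital,
    # so that camel case survives the loss of capitalization.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Replace each run of disallowed characters (including spaces) with an underscore.
    name = re.sub(r"[^A-Za-z0-9_]+", "_", name)
    # Collapse repeated underscores and trim them from both ends.
    name = re.sub(r"_+", "_", name).strip("_")
    # Names must start with a letter; the prefix used here is an arbitrary choice.
    if not name or not name[0].isalpha():
        name = f"{prefix}_{name}" if name else prefix
    return name


print(ords_safe_name("AuthorAge"))             # Author_Age
print(ords_safe_name("2012", prefix="Year"))   # Year_2012
print(ords_safe_name("Month of birth"))        # Month_of_birth
```

The script only suggests replacement names; you would still need to rename the tables and fields yourself in Access (or in the source file) before uploading.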

Field order

Unfortunately, when files are uploaded to ORDS from Access the order of fields within a table is not retained: instead, the fields will be sorted alphabetically (apart from the primary key field, which will appear as the first field in the table).

If you wish to retain the original field order, you will therefore need to add a prefix to each field name before uploading the file. This could be a letter followed by an underscore (A_ for the first field, B_ for the second, and so on), or one or more letters followed by a number and an underscore (for example, F01_ for the first field, F02_ for the second, and so on: F here simply stands for ‘field’, but any letter could be used). The prefix will ensure that when ORDS sorts the fields, the order remains unchanged.
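
As a rough illustration of the numbered-prefix approach, the Python sketch below generates prefixed names and confirms that an alphabetical sort leaves them in their original order. The function name add_order_prefixes and the sample field names are hypothetical.

```python
def add_order_prefixes(field_names, letter="F"):
    """Prefix each name with F01_, F02_, ... so that alphabetical order matches the original."""
    # Use enough digits for the largest index, so that F10_ sorts after F09_.
    width = max(2, len(str(len(field_names))))
    return [
        f"{letter}{index:0{width}d}_{name}"
        for index, name in enumerate(field_names, start=1)
    ]


original = ["Surname", "Forename", "Month_of_birth", "Age"]
prefixed = add_order_prefixes(original)
print(prefixed)
# An alphabetical sort leaves the prefixed names in their original order.
print(sorted(prefixed) == prefixed)  # True
```

Padding the numbers with leading zeros matters: without it, F10_ would sort before F2_.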

Formatting

You should also note that text formatting (e.g. coloured text, different fonts, etc.) will not be retained when the data is imported into ORDS.

Forms, queries, and reports

At present, only database tables (and their contents) can be uploaded to ORDS. If an Access file includes other database objects such as forms, queries, or reports, these will not be accessible via the ORDS interface.

Preparing Microsoft Excel files for ORDS

Access provides tools for importing Microsoft Excel files. This guide provides instructions for Access 2007; earlier or later versions of Access may look slightly different, but the basic process is similar.

Preparing your data

Before importing your Excel file, there are a few things you should check:

  • Ensure that there are no unwanted blank rows or columns in your Excel worksheet(s). It is fairly common for people to leave empty rows at the top of a spreadsheet, or to use empty columns for spacing purposes. These will appear as empty records or fields in Access (and therefore in ORDS), and should be removed. Any title rows or other information that refers to the whole worksheet (rather than to a specific column) should also be removed.
  • Ensure that each column has a heading, and that this fits into a single cell at the top of the column. These headings will become the field names in Access and in ORDS.
  • Ensure that each row contains all the relevant information. In Access and in ORDS, each row will be treated as a discrete record. If you have a heading or sub-heading with a number of rows beneath it, Access will not recognize that the rows are associated with the heading: instead, you will need to include the information contained in the heading in each row connected with it.
  • You should also note that text formatting (e.g. coloured text, different fonts, etc.) will not be retained when the data is imported into ORDS.

For example, consider this spreadsheet:

02 Example spreadsheet before.gif

To prepare this for importing into Access, the title and blank row and column should be removed, and the column headings should each be put into a single cell. The gaps in the Month of birth column also need to be filled in: someone looking at the spreadsheet may deduce that Andrew and Mark were born in January, but there is nothing to tell Access that this is the case.

This would give a spreadsheet looking like this:

02 Example spreadsheet after.gif
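
If a table is too large to fill in by hand, the gaps can be filled programmatically once the data has been exported. The Python sketch below copies the last value seen in a column into the blank cells beneath it. The function name fill_down and the sample rows are hypothetical, loosely modelled on the example above.

```python
def fill_down(rows, column):
    """Copy the last non-empty value of `column` into the rows beneath it that leave it blank."""
    filled = []
    last_value = ""
    for row in rows:
        row = dict(row)  # copy, so that the input rows are not modified
        value = (row.get(column) or "").strip()
        if value:
            last_value = value
        else:
            row[column] = last_value
        filled.append(row)
    return filled


# Hypothetical data: only the first row of each group states the month.
rows = [
    {"Name": "Andrew", "Month_of_birth": "January"},
    {"Name": "Mark", "Month_of_birth": ""},
    {"Name": "Sarah", "Month_of_birth": "March"},
]
for row in fill_down(rows, "Month_of_birth"):
    print(row["Name"], row["Month_of_birth"])
```

This is only appropriate where a blank cell really does mean 'same as above'; a cell that is blank because the value is unknown would be filled incorrectly.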

You will also need to ensure that your data meets the stipulations for file names, table names, and field (column) names given in section Preparing Microsoft Access files for ORDS above. If you need to make changes to your field names, it may be easier to do this in Excel, before importing the file into Access.

The finished result might end up looking something like this:

02 Example spreadsheet after2.gif

This dataset is now ready to be imported into Access.

Importing your data into Microsoft Access

To import the file, open a new blank database in Access. The program will prompt you to give the database a name, and to choose where to save it.

Click the External Data tab in the ribbon at the top of the screen, then click Excel in the Import group of icons.

In the pop-up window that appears, click the Browse... button, and select the file you wish to import. Ensure that the radio button for Import the source data into a new table in the current database is selected, then click OK.

The Access Import Spreadsheet Wizard will now take you through a series of steps to import your file.

If your Excel file has multiple worksheets, you can import these into Access as separate tables within the same database. However, you will need to import each worksheet separately. Access will ask you to select the worksheet you wish to import.

02 Access import 1.gif

Click Next > to continue.

To ensure that Access uses your column headings as field names, tick the box labelled First Row Contains Column Headings.

02 Access import 2.gif

Click Next > to continue.

The Wizard will invite you to specify information about the fields you are importing. In most cases, you can leave these at their default values. However, if any of your fields contain a large amount of text (more than 255 characters), you should select the field by clicking on it, and then use the Data Type pull-down menu to set the field type to Memo rather than Text. If you don’t do this, you may find that the text is truncated.

02 Access import 3.gif

Click Next > to continue.

Access will invite you to define a primary key for the table. A primary key is a field which is used to uniquely identify each record in a database table, and consequently it is essential that the primary key value for each record is not shared by any other records in that table. It is usually simplest just to let Access add a primary key (which it will do by giving each record an ID number), though in some cases you may wish to select one of the existing fields in your database to serve as the primary key.

02 Access import 4.gif
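
If you are considering using one of your existing fields as the primary key, it is worth checking first that its values really are unique. The Python sketch below is one way to do this outside Access, assuming the data has been loaded as a list of dictionaries. The function name duplicate_key_values and the sample data are hypothetical.

```python
from collections import Counter


def duplicate_key_values(rows, column):
    """Return the values of `column` that are blank or that appear in more than one row."""
    counts = Counter((row.get(column) or "").strip() for row in rows)
    return sorted(value for value, n in counts.items() if n > 1 or value == "")


# Hypothetical records: the code A1 has been used twice.
rows = [
    {"Code": "A1", "Name": "Andrew"},
    {"Code": "A2", "Name": "Mark"},
    {"Code": "A1", "Name": "Sarah"},
]
print(duplicate_key_values(rows, "Code"))  # ['A1']
```

An empty list suggests that the field could serve as a primary key; anything else means it is safer to let Access add its own ID field.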

Click Next > to continue.

At the end of the process, Access will invite you to provide a name for the table the data is about to be imported into.

02 Access import 5.gif

Click Finish and then Close to complete the process.

The name of the newly created table will appear in the left-hand sidebar. Double-click this to open it and ensure everything is as expected. (Occasionally, Access may import blank rows or columns from the end of the spreadsheet as empty fields or records in the table; it is best to delete these before importing the data into ORDS.)

If Access encounters any problems in importing the data, it will create a second table listing possible errors, and indicating which fields of which records are affected. Once you have checked these records (and hopefully resolved any problems), this second table can be deleted.

Repeat the above steps for each worksheet you wish to import. Your dataset is now ready to be imported into ORDS. For guidance on how to do this, please see the next HOW-TO in the series: Importing an existing database.

Preparing other tabular file formats for ORDS

If your data consists of one or more flat files (that is, individual tables of data which do not contain links or relationships to data in other tables – also known as rectangular data), you can save the data as a tab-delimited or comma-separated values (.csv) file. This converts your data table into a plain text file, with the columns or fields separated by tabs or commas. This file can then be imported into Access.

If you are using a relational database system other than Access (for example, FileMaker Pro), please contact the ORDS help desk on xxxxx@xxx.oucs.ox.ac.uk for advice on importing your data.

Converting your data to tab-delimited or .csv format

Many software applications (including statistical packages such as Stata and SPSS) include tools for exporting data in these formats: see the documentation or help files in the program you are using for advice on how to do this.

Before saving your data in one of these formats, you should first ensure:

  • That there are no unwanted blank rows or columns in your tables. It is fairly common for people to use empty rows or columns for spacing purposes, but these will appear as empty records or fields in Access (and therefore in ORDS), and should be removed. Any title rows or other information that refers to the whole table (rather than to a specific column) should also be removed.
  • That each column has a heading, and that this fits into a single cell at the top of the column. These headings will become the field names in Access and in ORDS.
  • That each row contains all the relevant information. In Access and in ORDS, each row will be treated as a discrete record. If you have a heading or sub-heading with a number of rows beneath it, Access will not recognize that the rows are associated with the heading: instead, you will need to include the information contained in the heading in each row connected with it.
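
The first of these checks can be automated. The Python sketch below removes rows and columns that are entirely empty from a table held as a list of rows; it is illustrative only, and the function name drop_blank_rows_and_columns is hypothetical. A column that has a heading but no data is deliberately kept, since the heading counts as content.

```python
import csv
import io


def drop_blank_rows_and_columns(rows):
    """Remove rows and columns in which every cell is empty or contains only spaces."""
    rows = [list(row) for row in rows if any(cell.strip() for cell in row)]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of cells.
    rows = [row + [""] * (width - len(row)) for row in rows]
    keep = [i for i in range(width) if any(row[i].strip() for row in rows)]
    return [[row[i] for i in keep] for row in rows]


# Hypothetical comma-separated text with an empty spacer column and an empty row.
text = "Name,,Age\n,,\nAndrew,,34\n"
table = list(csv.reader(io.StringIO(text)))
print(drop_blank_rows_and_columns(table))  # [['Name', 'Age'], ['Andrew', '34']]
```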

See section Preparing Microsoft Excel files for ORDS above for a worked example of preparing a data table for importing into Access.

You also need to ensure that your data meets the stipulations for file names and field (column) names given in section Preparing Microsoft Access files for ORDS above. If you need to make changes to your field names, it may be easier to do this before importing the file into Access.

Additionally:

  • Note that if your data includes commas (in text fields, for example), and you save it in .csv format, each comma may subsequently be treated as if it were a column break, unless the program that wrote the file enclosed the affected values in quotation marks. In these cases, you are therefore likely to get better results if you use a tab-delimited format.
  • You should also note that any formatting (coloured text or different fonts, for example) will be lost when you save the data in a plain text format.
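
If the program you are using cannot export tab-delimited text directly, a short script can do the conversion. The Python sketch below uses the standard csv module to write a table with tabs between the columns, and shows why commas within the data are a problem when a line is split naively on commas. The function name to_tab_delimited and the sample record are hypothetical.

```python
import csv
import io


def to_tab_delimited(rows):
    """Return the rows as tab-delimited text, one record per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = [
    ["Title", "Year"],
    ["Letters, diaries, and notes", "1912"],  # hypothetical record containing commas
]

# Splitting comma-separated text naively on commas produces too many pieces.
naive = ",".join(rows[1]).split(",")
print(len(naive))  # 4 pieces instead of 2 columns

# With tabs as the delimiter, the commas in the data need no special treatment.
tab_text = to_tab_delimited(rows)
round_trip = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))
print(round_trip == rows)  # True
```

To write the result to disk, you could save the returned text to a file with a .txt extension and select Tab as the delimiter when importing it.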

Once you have created your tab-delimited or .csv file, you can then import this into Access.

Importing your data into Microsoft Access

This guide provides instructions for Access 2007; earlier or later versions of Access may look slightly different, but the basic process is similar.

To import the file, open a new blank database in Microsoft Access. The program will prompt you to give the database a name, and to choose where to save it.

Click the External Data tab in the ribbon at the top of the screen, then click Text File in the Import group of icons.

In the pop-up window that appears, click the Browse... button, and select the file you wish to import. Ensure that the radio button for Import the source data into a new table in the current database is selected, then click OK.

The Access Import Text Wizard will now take you through a series of steps to import your file.

Access should detect that your data is in a delimited format; if it does not, select the Delimited radio button.

02 Access import CSV 1.gif

Click Next > to continue.

Select the appropriate delimiter for your file – this will be either Tab or Comma, depending on your file format.

To ensure that Access uses your column headings as field names, tick the box labelled First Row Contains Field Names.

02 Access import CSV 2.gif

Click Next > to continue.

The Wizard will invite you to specify information about the fields you are importing. In most cases, you can leave these at their default values. However, if any of your fields contain a large amount of text (more than 255 characters), you should select the field by clicking on it, and then use the Data Type pull-down menu to set the field type to Memo rather than Text. If you don’t do this, you may find that the text is truncated.

02 Access import 3.gif
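
If you are not sure which fields contain more than 255 characters, you can check the text file before starting the import. The Python sketch below reads a delimited file with a heading row and lists the columns containing at least one value over the limit. The function name fields_needing_memo and the sample data are hypothetical, and the sketch assumes a tab-delimited file unless told otherwise.

```python
import csv
import io


def fields_needing_memo(lines, delimiter="\t", limit=255):
    """Return the headings of columns in which any value is longer than `limit` characters."""
    reader = csv.DictReader(lines, delimiter=delimiter)
    too_long = set()
    for record in reader:
        for heading, value in record.items():
            # Skip surplus cells (heading is None) and missing cells (value is None).
            if heading is not None and value is not None and len(value) > limit:
                too_long.add(heading)
    return sorted(too_long)


# Hypothetical data: the Notes value in the first record is 300 characters long.
sample = io.StringIO("ID\tNotes\n1\t" + "x" * 300 + "\n2\tshort\n")
print(fields_needing_memo(sample))  # ['Notes']
```

For a file on disk, an open file object can be passed in place of the sample text, for example one created with open(path, newline="", encoding="utf-8").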

Click Next > to continue.

Access will invite you to define a primary key for the table. A primary key is a field which is used to uniquely identify each record in a database table, and consequently it is essential that the primary key value for each record is not shared by any other records in that table. It is usually simplest just to let Access add a primary key (which it will do by giving each record an ID number), though in some cases you may wish to select one of the existing fields in your database to serve as the primary key.

02 Access import CSV 4.gif

Click Next > to continue.

At the end of the process, Access will invite you to provide a name for the table the data is about to be imported into.

02 Access import CSV 5.gif

Click Finish and then Close to complete the process.

The name of the newly created table will appear in the left-hand sidebar. Double-click this to open it and ensure everything is as expected. (Occasionally, Access may import blank rows or columns from the end of the table as empty fields or records; it is best to delete these before importing the data into ORDS.)

If Access encounters any problems in importing the data, it will create a second table listing possible errors, and indicating which fields of which records are affected. Once you have checked these records (and hopefully resolved any problems), this second table can be deleted.

Repeat the above steps for each file you wish to import. If your dataset consists of a number of tables, these can all be added to the same Access database.

Your dataset is now ready to be imported into ORDS. For guidance on how to do this, please see the next HOW-TO in the series: Importing an existing database.

What next?

You may want to have a look at the next HOW-TO in the series:

Importing an existing database

You may also be interested in the whole list of HOW-TOs:
  1. Registering and creating a new project
  2. Preparing your data for ORDS
  3. Importing an existing database
  4. Creating a new database from scratch
  5. Creating and managing copies of your ORDS database
  6. Editing, filtering, and searching data using the ORDS
  7. Editing the structure of a database
  8. Sharing data with colleagues
  9. Creating customized data views
  10. Publishing datasets online
  11. Exporting data from the ORDS
  12. Using non-standard character sets
You can also find out more about the ORDS service by visiting the ORDS home page http://xxxxx.xxx.oucs.ox.ac.uk . If you have specific queries, you can contact the ORDS help desk by emailing xxxxxxxxxxxx@xxxxx.oucs.ox.ac.uk .