Steps for converting non-analyzable wage, time, and business electronic data

When manual data entry of non-analyzable financial or wage data is not an option, OCR software combined with purpose-built data cleaning routines is a good alternative.

For example, in our approach we use a number of OCR programs, including ABBYY FineReader, to first translate the data into a format that is recognized by statistical programs such as STATA and scripting languages such as VBA.

Once the data is converted, we write specialized routines to extract the relevant data from the converted file. The code, which is written in STATA, VBA, or another scripting language, puts the extracted data into a format that can be analyzed by statistical and spreadsheet programs.
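As a rough illustration of the extraction step, here is a minimal sketch in Python rather than STATA or VBA. The sample OCR text, the column layout, the field names, and the character substitutions (the letter O read in place of zero, for instance) are all assumptions made for the example; real converted files will need their own patterns.

```python
import csv
import io
import re

# Hypothetical OCR output from a scanned time record, including a header
# line, a page footer, and typical OCR misreads (letter O for zero).
OCR_TEXT = """\
EMP ID   DATE        IN     OUT    HOURS
1O23     01/05/2015  08:00  16:30  8.50
1024     01/05/2015  07:45  17:15  9.5O
-- page 1 of 3 --
1025     01/06/2015  09:00  17:00  8.00
"""

# Assumed layout: employee id, date, time in, time out, hours worked.
RECORD = re.compile(
    r"^(?P<emp_id>[\dOoIl]+)\s+"
    r"(?P<date>\d{2}/\d{2}/\d{4})\s+"
    r"(?P<time_in>\d{2}:\d{2})\s+"
    r"(?P<time_out>\d{2}:\d{2})\s+"
    r"(?P<hours>[\dOo]+\.[\dOo]+)$"
)

# Letters that OCR commonly confuses with digits in numeric fields.
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1"})

FIELDS = ["emp_id", "date", "time_in", "time_out", "hours"]


def extract_records(text):
    """Keep only lines that look like time records and repair numeric fields."""
    records = []
    for line in text.splitlines():
        match = RECORD.match(line.strip())
        if match is None:
            continue  # header, footer, or other non-record line
        row = match.groupdict()
        row["emp_id"] = row["emp_id"].translate(DIGIT_FIXES)
        row["hours"] = float(row["hours"].translate(DIGIT_FIXES))
        records.append(row)
    return records


def to_csv(records):
    """Write the records as CSV text that statistical packages can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = extract_records(OCR_TEXT)
csv_text = to_csv(records)
print(csv_text)
```

Because the rules live in a script rather than in manual edits, the same input file always produces the same output file.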

This approach to converting wage, business, employment, or other types of data has the advantage that it can be reproduced by either party if required.

Having both the data cleaning and the statistical and economic analysis performed by the same economic team is desirable. Data cleaning is not performed in a vacuum; the very definition of ‘dirty data’ depends on what the data is to be used for. Some data items may not convert very well through the OCR and software code, but those items may be of little value in the economic and statistical analysis in the first place.

One advantage of using the same research outfit to do both the data cleaning and the economic and statistical analysis is that the distinction between data items that matter and those that do not gets made early in the analysis process.
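The idea that ‘dirty’ depends on the intended use can be sketched in a few lines of Python. Everything here is hypothetical: the field names, the validation rules, and the sample records are assumptions chosen for illustration, not a description of any particular case. The sketch checks only the fields a given analysis needs, so a record with a badly converted but irrelevant field is not held up for review.

```python
def is_hours(value):
    """A plausible daily hours value: a number between 0 and 24."""
    try:
        return 0 <= float(value) <= 24
    except ValueError:
        return False


def is_nonempty(value):
    return bool(value.strip())


# Hypothetical validation rule for each field in the converted data.
VALIDATORS = {
    "emp_id": str.isdigit,
    "hours": is_hours,
    "supervisor_note": is_nonempty,
}


def triage(records, needed_fields):
    """Split records into clean and needs-review, checking only needed fields."""
    clean, needs_review = [], []
    for record in records:
        problems = [
            field
            for field in needed_fields
            if not VALIDATORS[field](record.get(field, ""))
        ]
        if problems:
            needs_review.append((record, problems))
        else:
            clean.append(record)
    return clean, needs_review


records = [
    {"emp_id": "1023", "hours": "8.50", "supervisor_note": "ok"},
    {"emp_id": "1024", "hours": "9.5O", "supervisor_note": "ok"},  # O for 0
    {"emp_id": "1025", "hours": "8.00", "supervisor_note": ""},    # note lost
]

# An hours analysis does not use the supervisor note, so only the
# record with the unreadable hours value needs review.
clean, needs_review = triage(records, ["emp_id", "hours"])
print(len(clean), len(needs_review))

# If the note also mattered, the third record would be flagged as well.
clean_all, review_all = triage(records, ["emp_id", "hours", "supervisor_note"])
print(len(clean_all), len(review_all))
```

When the same team does both jobs, the list of needed fields is known before the cleaning starts, so effort goes to the fields that affect the results.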

 
