Speed and Fatigue
STANFORD UNIVERSITY
GRADUATE SCHOOL OF BUSINESS
From: Roberto M. Fernandez, Ph.D.
RE: DataHand® Pilot
This will report the results of my reanalysis of the FSM pilot data at Rio Salado. As you know, the Rio Salado study was designed as a pilot study to 1) explore operators' reactions to the DataHand® technology; 2) demonstrate the feasibility of DataHand®'s use during actual production; and 3) provide preliminary data on DataHand®'s performance in the field. While it has served these purposes well, it is best to keep in mind that this pilot was not conducted as a controlled experimental study. This means that the data from this study may be biased in a number of ways. We would expect these biases to cut in different directions, sometimes working in favor of DataHand®, and other times working against DataHand®. What we cannot know without actually doing a controlled, experimental study is in which direction the biases lean. (It is, of course, possible that these biases might cancel out altogether.) I have reviewed the design of the pilot study, and have identified several factors that might result in bias; I review these factors on page 2.
Summary of Findings Based on the Rio Salado Pilot Data
Based on the data I have been given, I am confident that the pilot data show the following descriptive patterns:
- On average, DataHand® runs showed a throughput (Items Fed per hour of Runtime) 5.4 percent higher than runs with operators using only 10-Key keyboards (see Table 1).
- DataHand® runs showed an average Quality-Adjusted throughput (Items Accepted per hour of Runtime) 6.2 percent higher than the average for 10-Key runs (Table 1).
- DataHand®'s throughput advantages (both Unadjusted and Quality-Adjusted) over 10-Key grow with the length of the run (see Table 2 and Figures 1 and 2). In runs over 6 hours in length, DataHand® throughput advantages climb to 12.3 and 10.4 percent for Unadjusted and Quality-Adjusted throughput, respectively. While the uncontrolled nature of this study demands we exercise caution in interpretation (see below), this pattern is consistent with the idea that operators using DataHand® are less subject to fatigue than are operators using the 10-Key.
- Keyboard differences in average quality (defined as Items Accepted/Items Fed) also favored DataHand®. Although the differences between DataHand® and 10-Key were small, they were statistically significant.
Some Caveats: Issues to Consider Before Drawing Conclusions Based on Rio Salado Pilot
As I mentioned above, data from this pilot study may be biased in a number of different ways, and these biases might affect the substantive findings presented in this report. The following factors need to be considered before drawing conclusions on the basis of the findings presented here:
Generalizability
The pilot data reflect FSM operations at a particular time and place. Does Rio Salado present any special features which might not be present at other sites and would affect the way DataHand® might be used? Similarly, does the Nov.-Dec. data coupon period present any unique circumstances that might affect DataHand® implementation?
Nature of the Comparison Groups
A) Volunteers vs. Non-Volunteers. DataHand® operators were former 10-Key operators who volunteered to try DataHand®. If such volunteers are systematically different from those who did not volunteer, then we run the risk of wrongly attributing performance differences to the DataHand® technology when such performance differences might be due to differences between volunteers and non-volunteers. If the people who volunteered for DataHand® are the people who were the fastest of the 10-Keyers, then fast people are overrepresented among DataHand® volunteers, and slow people would be overrepresented among 10-Keyers. This would mean that the performance differences we found here are likely to be overstated. On the other hand, if DataHand® attracted those former 10-Keyers who were experiencing the most pain on the 10-Key (DataHand® was touted as something that should help with pain), then DataHand® volunteers would be among the slower 10-Keyers, and non-volunteers would be among the fastest 10-Keyers. This would imply that the findings here understate performance gains associated with DataHand®. Only with a controlled study can we determine which of these two biases would predominate. (NOTE: Based on my conversations with both supervisors and operators, my impression is that the latter bias should dominate. All the DataHand® operators told me they experienced some pain relief compared with 10-Key; a number of supervisors and operators also mentioned that a number of the best 10-Key operators chose not to volunteer for DataHand®. While this is suggestive, further study would be required before drawing firm conclusions.)
B) Other Systematic Differences. In an uncontrolled study, many other factors may be non-randomly associated with the DataHand® and 10-Key operators. For example, if DataHand® operators were to have received more "clean" than "dirty" mail to process, then this would account for some part of the DataHand® performance advantage. With a controlled study, the influence of factors like these could be eliminated.
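The two volunteer scenarios described under item A can be made concrete with a toy calculation. The sketch below is purely illustrative: the four baseline speeds, the 5 percent "true" device effect, and the function name are all invented for the example and are not pilot data.

```python
from statistics import mean


def observed_advantage(baselines, volunteers, true_effect):
    """Percent gap between the volunteer (DataHand) and non-volunteer (10-Key) means.

    baselines   -- hypothetical 10-Key throughput for each operator
    volunteers  -- set of indices of operators who switch to DataHand
    true_effect -- assumed proportional gain from the device (0.05 means 5%)
    """
    datahand = [baselines[i] * (1 + true_effect) for i in volunteers]
    ten_key = [b for i, b in enumerate(baselines) if i not in volunteers]
    return 100 * (mean(datahand) / mean(ten_key) - 1)


# Invented baseline speeds (items per hour), listed from slowest to fastest.
speeds = [5600, 5900, 6100, 6400]
TRUE_EFFECT = 0.05  # assumed, for illustration only

fast_volunteer = observed_advantage(speeds, {2, 3}, TRUE_EFFECT)  # fastest two switch
slow_volunteer = observed_advantage(speeds, {0, 1}, TRUE_EFFECT)  # slowest two switch

print(f"fastest operators volunteer: {fast_volunteer:+.1f}%")
print(f"slowest operators volunteer: {slow_volunteer:+.1f}%")
```

With these invented numbers, the observed gap exceeds the assumed 5 percent effect when the fastest operators volunteer and falls below it when the slowest do, which is the direction of bias described in each scenario.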
Incentives in Context of Time-Saving Technological Change
Who benefits when an innovation introduces time savings, the employer or the worker? If employers cannot perfectly monitor their workers (as they almost never can), then workers might be tempted to take some (if not all) of the time-savings associated with a new production process in the form of leisure (longer breaks) or a slower pace of production. For the employer to recoup some part of the time savings requires some change in incentives, either positive or negative. For example, extra pay or some other "carrot" might entice workers to not consume the time saved by the innovation. On the other hand, closer monitoring on the part of the employer (a kind of negative incentive) might also reclaim the newly-found production time.
Incentives were not changed during the Rio Salado pilot study. Although only a controlled study design would allow us to confirm this, it is plausible that some portion of DataHand®'s performance advantage is being consumed by workers. If this process is at work, then this source of bias would appear to cut only one way: the performance advantage associated with DataHand® is likely to be understated in the pilot study. We will need to do a controlled study to gauge the magnitude of this effect.
DATA
This last section of the memo presents details of the data collection and analysis, and presents the evidence supporting the bullet points on page 1.
The Rio Salado site runs 7 FSM machines, each with four data input consoles which have been outfitted with both a 10-Key pad and a DataHand® unit (an A-B switch is used to switch each console from the 10-Key unit to the DataHand® and vice versa). Operators work in teams of six, four inputting data, while the other two load mail to be processed by the input operator. Operators rotate between loading mail and inputting; each operator may use either the 10-Key input pad or the DataHand® at their discretion.
Wendy Goodman (Phoenix In-Plant Support) provided me with data on FSM production. What is important to realize about these data is that they correspond to the machine's overall performance. That is, the computer that runs the FSM machines keeps track of the totals (e.g. numbers of items fed and accepted, runtime, downtime, idletime etc.) for each machine. Given this configuration, there is no way to identify an individual's performance in inputting data. The best we can get from this setup is the performance of the team working on each machine.
Ms. Goodman collected reports of daily production runs for each machine over a period of several months. She asked supervisors to note on these reports whether the production run was done by operators who used only 10-Key, only DataHand®, or whether members of the team used both 10-Key and DataHand®.
Ms. Goodman provided me with complete data (i.e., not a sample) on FSM production runs for 34 days for which she was confident that the data had been correctly collected and coded. These 34 days span the period from 11/5/96 to 12/13/96. The data are in the form of one spreadsheet per day. Columns in the spreadsheet describe various characteristics of performance (e.g., Total Items Fed, Total Items Accepted, etc.). There are separate rows in the spreadsheet for each of the 7 machines' runs, arranged by tour (3 tours a day), for a total of 21 rows of production run data. While the total number of production runs is potentially 714 (21 runs x 34 days), some machines did not run at all on some days. After eliminating rows for which the FSM machine did not run at all, these spreadsheets yielded a total of 699 production runs.
Of these 699 runs, 33.8 percent (236) were Barcode runs, done almost exclusively on second tour. Since neither DataHand® nor 10-Key pad were used for these runs, we have dropped them from the dataset. Supervisors did not note which keyboard was used in an additional 6.8 percent (48) of the runs. Supervisors reported that both DataHand® and 10-Key were used in 98 runs. We have also dropped these runs from the database. A total of 317 runs remain after eliminating the Barcode runs, mixed DataHand®/10-Key runs, and runs for which we could not identify which keyboard was being used. Of these, 197 were done exclusively with DataHand®, and 120 with 10-Key.
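The run counts in the two paragraphs above can be checked with simple bookkeeping. The following is a minimal sketch using only the totals stated in the memo; the variable names are my own, and the count of 48 unlabeled runs is the value implied by the other totals (699 less 236, 98, and 317).

```python
# Potential runs: 7 machines x 3 tours per day x 34 days.
machines, tours_per_day, days = 7, 3, 34
potential_runs = machines * tours_per_day * days

# Runs left after removing rows where the machine did not run at all.
actual_runs = 699

# Runs dropped from the analysis.
barcode_runs = 236       # neither keyboard used
unlabeled_runs = 48      # keyboard type not noted (implied by the other totals)
mixed_runs = 98          # both DataHand and 10-Key used

remaining_runs = actual_runs - barcode_runs - unlabeled_runs - mixed_runs

# Analysis sample by keyboard type.
datahand_only, ten_key_only = 197, 120

print("potential runs:", potential_runs)
print("runs analyzed: ", remaining_runs)
print("counts add up: ", remaining_runs == datahand_only + ten_key_only)
```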
I focused on three measures of performance in this study. Raw throughput is defined as Items Fed per hour of Runtime; Quality-adjusted throughput is calculated as Items Accepted per hour of Runtime; Quality is the number of Items Accepted divided by the number of Items Fed. Our strategy is to measure the differences in performance between runs where the operators used DataHand® exclusively and runs where 10-Key was used exclusively.
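The three measures can be written out as a short sketch. The class and field names below are hypothetical (the memo does not describe the spreadsheet column names beyond Total Items Fed and Total Items Accepted), and the example run uses invented totals rather than pilot data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionRun:
    """Machine-level totals for one production run (hypothetical field names)."""
    items_fed: int
    items_accepted: int
    runtime_hours: float

    @property
    def raw_throughput(self) -> float:
        # Items Fed per hour of Runtime
        return self.items_fed / self.runtime_hours

    @property
    def quality_adjusted_throughput(self) -> float:
        # Items Accepted per hour of Runtime
        return self.items_accepted / self.runtime_hours

    @property
    def quality(self) -> float:
        # Items Accepted divided by Items Fed
        return self.items_accepted / self.items_fed


# Invented totals for a single 5-hour run, for illustration only.
run = ProductionRun(items_fed=32_000, items_accepted=31_680, runtime_hours=5.0)
print(run.raw_throughput, run.quality_adjusted_throughput, run.quality)
```

By construction, quality-adjusted throughput equals raw throughput multiplied by quality, so the three measures are not independent of one another.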
Table 1 reports averages for the various performance measures by keyboard type and the percent increase of DataHand® runs over 10-Key runs. For all three performance measures, on average, runs with operators using only DataHand® outperform runs made by operators using only 10-Key. Moreover, all three of these contrasts are statistically significant (assessed via t-tests of differences in means).
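For readers unfamiliar with a t-test of a difference in means, here is a minimal sketch. The memo does not say which variant was used; this example assumes Welch's unequal-variance form, and the two samples are invented run-level throughputs, not the pilot data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom for mean(a) - mean(b)."""
    # Squared standard error contributed by each sample (sample variance / n).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


# Invented per-run throughputs (items per hour), for illustration only.
datahand_runs = [6450, 6380, 6290, 6510, 6400, 6350]
ten_key_runs = [6100, 6020, 5950, 6180, 6060, 6110]

t_stat, dof = welch_t(datahand_runs, ten_key_runs)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The statistic would then be compared against a t distribution with the computed degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) performs the same test and also returns the p-value.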
Table 1. Average Performance by Keyboard Type (Number of Runs in Parentheses)
| | DataHand® | 10-Key | DataHand vs. 10K |
|---|---|---|---|
| Raw | 6397.084 | 6070.338 | |
| Quality-adjusted | 6351.585 | 5978.970 | |
| Quality | .99315 | .99003 | |
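The percent column of Table 1 did not survive in this copy, but the summary reports 5.4 and 6.2 percent for the two throughput measures. The sketch below recomputes the percent differences from the averages in the table; the function name is my own, and the quality figure it prints is derived here rather than quoted from the memo.

```python
# Averages from Table 1: (DataHand, 10-Key)
table1 = {
    "Raw": (6397.084, 6070.338),
    "Quality-adjusted": (6351.585, 5978.970),
    "Quality": (0.99315, 0.99003),
}


def pct_advantage(datahand, ten_key):
    """Percent by which the DataHand average exceeds the 10-Key average."""
    return 100 * (datahand / ten_key - 1)


advantages = {name: pct_advantage(dh, tk) for name, (dh, tk) in table1.items()}

for name, pct in advantages.items():
    print(f"{name:17s} {pct:5.1f}%")
```

The two throughput results round to the 5.4 and 6.2 percent quoted in the summary; the quality difference works out to roughly three tenths of one percent, consistent with the memo's description of it as small.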
The final point I'd like to make concerns the relationship between throughput and runtime. Table 2 displays the relationship between DataHand®'s throughput advantages and length of runtime. For the small number of runs lasting less than 3 hours, on average, DataHand® underperforms 10-Key. DataHand® surpasses 10-Key in runs between 3 and 4 hours in length, and substantially outperforms 10-Key as runtimes become longer. Figures 2 and 3 plot the throughput numbers in Table 2. While throughput declines for both types of keyboards as runtime increases, the decline for 10-Key runs begins earlier and is more precipitous than for DataHand® runs. As I mentioned above, while we must be cautious and heed the caveats discussed above, this pattern is consistent with an interpretation that DataHand® operators do not fatigue as quickly as 10-Keyers. It is also consistent with DataHand® operators' self-reports: all of the operators I met who use DataHand® claimed that they do not fatigue as quickly on DataHand® as they did on 10-Key.
Table 2. Average Throughput by Keyboard Type and Runtime (Number of Runs in Parentheses)
| Raw Throughput: | DataHand® | 10-Key | % Increase DH vs. 10K |
|---|---|---|---|
| <3 hours | 6138.408 (N=15) | 6363.070 (N=23) | -3.5% |
| 3-4 hours | 6656.383 (N=19) | 6490.095 (N=19) | 2.6% |
| 4-5 hours | 6413.453 (N=39) | 6103.593 (N=23) | 5.1% |
| 5-6 hours | 6491.648 (N=86) | 5844.194 (N=47) | 11.1% |
| >6 hours | 6138.732 (N=38) | 5464.799 (N=8) | 12.3% |
| Quality-Adjusted Throughput: | | | |
| Runtime: | | | |
| <3 hours | 6070.903 (N=15) | 6303.293 (N=23) | -3.7% |
| 3-4 hours | 6602.224 (N=19) | 6412.127 (N=19) | 3.0% |
| 4-5 hours | 6366.569 (N=39) | 5782.908 (N=23) | 10.1% |
| 5-6 hours | 6441.824 (N=86) | 5806.039 (N=47) | 10.9% |
| >6 hours | 6117.454 (N=38) | 5542.938 (N=7) | 10.4% |
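The percent column of Table 2 can be recomputed from the bucket averages as a consistency check. This is a minimal sketch with my own variable and function names; the averages and reported percentages are copied from the table.

```python
# Table 2 averages by runtime bucket: (DataHand, 10-Key, reported % increase)
raw = {
    "<3 hours": (6138.408, 6363.070, -3.5),
    "3-4 hours": (6656.383, 6490.095, 2.6),
    "4-5 hours": (6413.453, 6103.593, 5.1),
    "5-6 hours": (6491.648, 5844.194, 11.1),
    ">6 hours": (6138.732, 5464.799, 12.3),
}
quality_adjusted = {
    "<3 hours": (6070.903, 6303.293, -3.7),
    "3-4 hours": (6602.224, 6412.127, 3.0),
    "4-5 hours": (6366.569, 5782.908, 10.1),
    "5-6 hours": (6441.824, 5806.039, 10.9),
    ">6 hours": (6117.454, 5542.938, 10.4),
}


def pct_increase(datahand, ten_key):
    """Percent by which the DataHand average exceeds the 10-Key average."""
    return 100 * (datahand / ten_key - 1)


def recompute(panel):
    """Return {bucket: (computed %, reported %)} for one panel of the table."""
    return {b: (pct_increase(dh, tk), rep) for b, (dh, tk, rep) in panel.items()}


raw_check = recompute(raw)
qa_check = recompute(quality_adjusted)

for label, check in (("Raw", raw_check), ("Quality-adjusted", qa_check)):
    for bucket, (computed, reported) in check.items():
        print(f"{label:17s} {bucket:10s} computed {computed:6.2f}%  reported {reported:5.1f}%")
```

Every recomputed value agrees with the reported column to within rounding. The 5-6 hour quality-adjusted cell computes to about 10.95 percent, which sits on the rounding boundary of the reported 10.9.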