
Chi Square Analysis for Halloween: Is There a Slasher Movie Gender Gap?


Chi Square Analysis for a Frightening Halloween

Chi-square analysis compares the counts of two categorical variables to tell you whether a relationship exists between them. It's a tool with many applications in the world of business and quality improvement. For example, let's say you manage three call centers and you want to know if solving a customer's problem depends on which center gets the call. To find out, you tally the successful and unsuccessful resolutions for each center in a table, and perform a chi-square test of independence on the data.
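If you want to try the same kind of test outside Minitab, here's a minimal sketch in Python using scipy's chi-square test of independence; the call-center counts below are invented purely for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical resolution counts for three call centers.
# Rows are centers, columns are (resolved, unresolved); all numbers invented.
observed = np.array([
    [180, 45],   # Center A
    [165, 60],   # Center B
    [150, 90],   # Center C
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"Chi-square = {chi2:.2f}, df = {dof}, p-value = {p:.4f}")
# A small p-value (say, below 0.05) suggests that problem resolution
# depends on which center takes the call.
```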

Practical applications like this are all well and good, but you can also apply chi-square analysis to answer important questions about factors in everyday life, and even about events like elections....or Halloween.

For example, there's been a lot of talk about gender gaps during this year's election, but I have yet to see anyone answer this Halloween-related question:  If you're a character in a slasher film, is there a connection between your gender and your dying in some horrible manner?  

Your gut reaction is probably, "Well, in every slasher movie, some crazed maniac chases a young woman around with a weapon. So there must be a connection." Anecdotally, that answer feels correct. But could you back up that assertion at a heated Halloween party? What do the data say?  Chi-square analysis can help. 

Categorizing Data about Deaths in Two Popular Horror Film Franchises

A quick web search revealed a site that helpfully offers (appallingly detailed) summaries of all of the deaths that have occurred in most major slasher and horror film franchises. I loaded the death data for the Halloween and Friday the 13th franchises into Minitab to see what I could learn. (You can download my datasheets here and here.) 

First, I needed to categorize the deaths: while the downloaded table includes helpful details about, for example, the startling range of weapons used in stabbings, for this analysis there's no real difference between being stabbed with a knife and being stabbed with a pitchfork. I also removed a few deaths for which a clear cause couldn't be determined.

Here are the five categories I divided the deaths into: 

  1. Stabbed (with anything)
  2. Blunt force trauma (including crushing and, in one case, a "bear hug")
  3. Vital parts removed (beheadings and such)
  4. Shot (by bullets, arrows, quills, and other projectiles)
  5. Exotic (like being cooked, frozen, or burned up in the atmosphere)
Tally and Cross Tabulation of Horror Movie Deaths

Next, I used Minitab Statistical Software to tally the victims' gender and type of death for each series by selecting Stat > Tables > Tally Individual Variables and selecting Counts and Percents. 

Here are the results for the Friday the 13th series: 

Friday the 13th Death Tally

And the Halloween series: 

Halloween Death Tally

We can see a couple of interesting things. First, the distribution across death categories is remarkably even between the two series; a few more characters die from being shot in the Halloween films, but otherwise the proportions are very similar.

But more eye-opening is the evidence that, in both franchises, the idea that victims are predominantly women appears to be wrong. Just 30 percent of characters who die in the Halloween series are women, and only 34 percent are women in the F13 series. 

That got me wondering whether the type of death inflicted on a character in these series was associated with the character's gender.  Exactly the type of question a chi-square test was made to answer!  

Chi-Square Test for Gender and Cause of Death in Slasher Film Series

I did the test on the Friday the 13th data first, choosing Stat > Tables > Cross Tabulation and Chi-Square in Minitab, with "Gender" as the categorical variable for rows and "Death Category" as the variable for columns.

 Friday the 13th Chi-Square Analysis

The p-values for this test are high, at 0.158 and 0.144. For this series, then, we can't conclude that there's a connection between gender and cause of death, at least not at the 0.05 significance level. In addition, the fact that two of the cells in our chi-square table had expected counts below 5 makes the analysis suspect.

Let's take a look at the Halloween series:

Halloween Chi Square Analysis

The p-values for this test are low, at 0.043 and 0.031. However, even though the p-values support a connection between gender and cause of death in the Halloween series, the fact that two cells in this chi-square table also had expected counts below 5 makes this analysis suspect, too.

But since the relative percentages of death types for men and women were pretty comparable across both series, what if we combine the cause of death data?  I did that in another data sheet, and ran the analysis again: 

Combined Slasher Movie Chi Square Analysis

The p-values are very low at 0.004 and 0.003, so we can reject the null hypothesis and conclude that there is an association between gender and cause of death for all of the characters across both of these series. If we run a cross-tabulation on this data to see the proportion of each type of death accounted for by gender, we get a clearer sense of how this association plays out. The Assistant in Minitab 16 does a very nice job of reporting these results visually:

Percentage Profiles Charts for Cause of Death in Slasher Movies
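For readers who want to poke at similar data themselves, here's a rough Python equivalent of that cross tabulation of proportions; the handful of rows and the column names are made up for illustration, not the actual worksheet.

```python
import pandas as pd

# Hypothetical long-format death data: one row per on-screen death.
# The column names and values are assumptions, not the blog's worksheet.
deaths = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male", "Female", "Male"],
    "Death Category": ["Stabbed", "Exotic", "Blunt force", "Shot",
                       "Stabbed", "Vital parts removed"],
})

# For each death category, what proportion is accounted for by each gender?
profile = pd.crosstab(deaths["Death Category"], deaths["Gender"],
                      normalize="index")
print(profile.round(2))
```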

We also can learn something from the Assistant's graphical display of the percentage difference between the observed and expected counts of each death type by gender: 

Percent Difference in Observed and Expected Counts of Death Types by Gender in Slasher Movies

 

It appears that men account for the overwhelming majority of deaths involving blunt force trauma and being shot (82% to 18% for both). In fact, more men than women are killed in every category, although the split in the "exotic" category is not as lopsided as the rest.

We can also look at this using Pareto charts of death category counts for each gender: 

Pareto Chart of Death Causes by Gender in Slasher Movies

It's clear that stabbing is the leading cause of death for characters in these films, regardless of gender. After that, however, the order of categories changes. It appears that men in these films are more frequently dispatched by blunt force trauma and the removal of vital parts, with more exotic exits ranking fourth. In contrast, the exotic methods rank second for female characters, followed by blunt force trauma and the removal of vital parts, respectively. 

Of course, this is really...erm...just splitting hairs, isn't it?  The bottom line is that regardless of gender, if you're a character in one of these films, you can count on something horrible happening to you.

Thank heavens it's only a movie, and happy Halloween!

 


Why Do So Many U.S. Voters Go MIA on Election Day? (Part II)


vote image

So what stops eligible U.S. voters from showing up at the polls on election day? I'm not going to be able to tease out all the possible factors associated with variability in voter turnout in one blog post.

But you can often be forgiven in statistics if you clearly state at the outset that your main objective is a preliminary exploration rather than a final, conclusive analysis. So let's explore.

Does temperature affect voter turnout?

Weather is sometimes suggested as a possible influence on voter turnout. Can you imagine the impact on the election if Hurricane Sandy had hit just one week later than it did?

But does it take the storm of a century to influence turnout? Might even more subtle weather variations, such as temperature drops, be associated with changes in turnout? 

To find out, I collected data on the mean election day temperature in the largest city of each state and the state’s voter turnout rate (using VEP) for each U.S. presidential election from 1980 through 2008. Here’s a Minitab Statistical Software scatterplot with a regression line based on data from two U.S. states:

scatterplot no groups

Look at the trend…it seems that a lower mean temperature on election day might be associated with a higher turnout. That makes sense, right? When you get cold, you make an extra effort to vote because those polling stations are sooo warm and toasty inside. Sometimes they even have free coffee or hot chocolate!

I hope you’re jumping out of your chair and pulling your hair out now.

Not just because I implied that correlation equals causation, not just because the data don’t hug that fit line closely, but also because by lumping data for two states (NC and MN) together on one scatterplot, I did something even more misleading than your average political attack ad.

To see why, look what happens when I display the same data using a categorical variable to differentiate the data for each state (choose Graph > Scatterplot > With Regression and Groups).

scatterplot with groups

Hmm…now we’ve got the opposite trend shown by the same data! Lower temperature seems to be loosely associated with lower turnout. Which scatterplot would you vote for—with or without groups?
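If you'd like to see the same reversal numerically, here's a small Python sketch that fits a pooled line and then a separate line per state; the temperatures and turnout rates are invented to mimic the pattern in the plots above, not the real data.

```python
import numpy as np
import pandas as pd

# Invented election-day temperatures (F) and VEP turnout (%) for two states,
# chosen only to mimic the pattern in the scatterplots above.
df = pd.DataFrame({
    "state":   ["MN"] * 4 + ["NC"] * 4,
    "temp":    [30, 35, 40, 45, 55, 60, 65, 70],
    "turnout": [70, 71, 73, 74, 49, 50, 52, 53],
})

# Pooling the states hides the group effect and yields a negative slope...
pooled_slope = np.polyfit(df["temp"], df["turnout"], 1)[0]
print(f"Pooled slope: {pooled_slope:.2f} turnout points per degree")

# ...while the within-state fits both trend upward.
for state, grp in df.groupby("state"):
    slope = np.polyfit(grp["temp"], grp["turnout"], 1)[0]
    print(f"{state} slope: {slope:.2f} turnout points per degree")
```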

In data analysis, beware of influence from special interest groups

Why does overlooking a group effect make such a big difference in these trends?

First, the predictor. Having lived in both states, I can say that a 50-degree day in November is a much different input in North Carolina than it is in Minnesota. And it's likely to have a different effect: Minnesotans might get out their beach towels; North Carolinians might get out their down jackets.

Second, the average turnout response for each state over this period is very different: about 72% for Minnesota, about 51% for North Carolina. Conglomerating the data without using a grouping variable masks the real trends in the relationship between temperature and turnout rate.

In this case, it was clear from the outset that the data were from two different states. But a hidden group effect can be much more insidious, because you might not even be aware of groups in your data.

For example, suppose your company tracks the defective rate of a product in response to different inputs. If the data come from different facilities, each with conditions that uniquely affect the inputs and with different historical defective rates, you could fall into the same trap if you analyze all of the data without considering a grouping variable for facility.

One thing that should make you suspicious that hidden groups might be lurking beneath the surface is the presence of separate clusters of data in the scatterplot—as you can see in the first (top) scatterplot above.

How can I standardize data to account for a group effect?

So I have a group effect. Do I have to display 50 separate scatterplots, one for each state, to evaluate the voter turnout vs mean temperature data for the U.S.?

That sounds like a hassle. And it doesn't give me nearly enough data in each plot to confidently identify a trend. Any outlier (like the high value in the scatterplot for North Carolina) will have an unduly large influence on the regression line.

There's another option. To account for the differences between the two groups, I can use Minitab's Calc > Standardize tool to standardize the data. For each location, I choose the option to subtract the overall mean temperature on all of the election days from the temperature on each election day. That gives me a measure of how abnormally cold or warm it was on that day. Similarly, for the response, the overall mean VEP turnout rate on all of the election days is subtracted from the turnout rate (VEP)  for each election day to indicate whether the turnout was relatively low or high for that year.
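Outside Minitab, the same group-wise centering can be done in a couple of lines; here's a sketch in Python with made-up numbers and assumed column names.

```python
import pandas as pd

# Invented election-day data for two states; column names are assumptions.
df = pd.DataFrame({
    "state":   ["MN", "MN", "MN", "NC", "NC", "NC"],
    "temp":    [30, 38, 46, 52, 60, 68],    # mean temperature (F)
    "turnout": [70, 72, 74, 48, 51, 54],    # VEP turnout (%)
})

# Subtract each state's own mean so both variables measure how unusually
# cold/warm the day was and how unusually low/high the turnout was.
group = df.groupby("state")
df["temp_centered"] = df["temp"] - group["temp"].transform("mean")
df["turnout_centered"] = df["turnout"] - group["turnout"].transform("mean")

print(df)
```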

After the data is standardized to account for differences between the states, here's what it looks like on a Minitab scatterplot:

scatterplot  of normalized temps

Now the trend for the combined data is consistent with the trend for each group. An increase in temperature from the mean seems to be associated with an increase in voter turnout. Notice the two separate clusters of data are gone.

It's still not much data to go on. But if I standardize the data for every state to account for the group effect, I can display the data for all 50 states on one scatterplot to examine the trend.

Unfortunately I don't have time to do that right now; it's election day and I need to check the outside temperature.

It won't affect whether I vote. But it might help me decide whether to grab a beach towel or a down coat before I head to the polls.

Note: If you'd like to follow along and create these graphs in Minitab, the data sets for this post are here.

Managing Diabetes with Six Sigma and Statistics, Part I


Controlling Diabetes with Six Sigma and Statistics

According to the American Diabetes Association, nearly 26 million children and adults in the U.S. have diabetes, and another 79 million Americans are at risk for developing Type 2 Diabetes.

Type 2 Diabetes is the most common form of the disease, and it occurs when either the body does not produce enough insulin or the cells ignore the insulin. Insulin is important because it allows the body to use glucose (which the body makes by converting sugars and starches in foods) for energy, and it actually moves the sugar from the blood into the cells to be used. When glucose builds up in the bloodstream instead of going into the cells, it can lead to life-threatening complications if left untreated.

While there’s no cure for diabetes, the disease can be managed by eating well, exercising, and using diabetes medications to regulate blood sugar.

When my colleague Eston referred me to Bill Howell, a quality professional who was recently diagnosed with Type 2 Diabetes, I got to learn firsthand how diabetes can be controlled successfully. And because this is a quality improvement blog, you can bet that Howell chose to manage his disease with nothing other than Six Sigma and statistics!

Getting Acquainted with Diabetes

Howell had a hunch something was wrong—he was always thirsty, and he began to suffer leg cramps and vision loss. A diabetic coworker encouraged him to test his blood glucose level, and he found his results were so far from the norm that the meter couldn’t even calculate an exact result. A few days later a doctor diagnosed Howell with Type 2 Diabetes, and offered him recommendations for diet, medications, and a plan to regularly monitor his blood sugar with blood tests.

Since he managed many quality improvement projects in his job, it was easy for Howell to think of his symptoms as defects that could be eliminated by completing a Six Sigma project.

He divided his diabetes plan into Six Sigma's five DMAIC phases (Define, Measure, Analyze, Improve, Control), and chose to rely on his project "sponsor," his physician, for guidance. He began by defining the problem he needed to solve and the impacts he sought to lessen, and by crafting a goal statement. Since his symptoms corresponded to high blood glucose, he wanted to bring his levels below 125 mg/dL, a normal target level. To reduce symptoms naturally and curb dependence upon medication, Howell also wanted to strictly follow his doctor's recommended plans for diet, medication, and blood testing.

Collecting the Data about Diabetes Symptoms

Daily blood glucose level was a key metric in understanding his disease, so Howell created a data collection plan to sample his blood three times per day. He charted his data using dotplots in Minitab (Graph > Dotplot), which allowed him to see how his glucose levels changed over time.

Ensuring Good Measurement Systems

As a Six Sigma practitioner, Howell also knew he needed to verify that his blood glucose measurements were reliable. To make sure his glucose meter produced valid results, he followed the manufacturer’s weekly calibration procedure and recorded the calibration results over time. By graphing his results, he confirmed that his values fell within the manufacturer’s calibration limits.

But he still wondered about the potential effects of drawing blood from different locations on his test results. Howell’s doctor encouraged him to draw blood only from his fingertips to control for any variability due to location of the testing site. But did it matter which finger he drew the blood from?

To find out, he created a randomized pattern of numbers from 1 to 10, assigned each of his fingers a number, and then tested them in the randomized order. Howell recorded the glucose levels for each finger tested and charted the results using a dotplot. The plot revealed that groupings of test results shared the same random pattern, which suggested that finger selection did not impact the test results.

He also performed a one-way analysis of variance (Stat > ANOVA > One-Way) to provide further statistical evidence about differences between fingers. The analysis aligned with the dotplot findings, revealing no evidence that the choice of finger affects the test outcome.
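For anyone curious how such a comparison looks outside Minitab, here's a minimal one-way ANOVA sketch in Python; the glucose readings below are invented for illustration and are not Howell's data.

```python
from scipy.stats import f_oneway

# Invented glucose readings (mg/dL) from three different fingers,
# just to show the mechanics; these are not Howell's measurements.
index_finger  = [132, 128, 135, 130]
middle_finger = [129, 134, 131, 127]
ring_finger   = [133, 130, 128, 132]

f_stat, p_value = f_oneway(index_finger, middle_finger, ring_finger)
print(f"F = {f_stat:.2f}, p-value = {p_value:.3f}")
# A large p-value provides no evidence that the choice of finger
# shifts the glucose reading.
```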

Now that Howell was confident his measurement system produced valid results, he wanted to assess the potential causes for his increased blood glucose levels. He created a cause-and-effect diagram in Minitab (Stat> Quality Tools> Cause-and-Effect), and chose to focus on causes he could control and analyze himself, such as diet and exercise. Pinpointing causes helped him plan for the next step of his project—recording his daily food intake.

Stay tuned for Part II to find out how Howell organized and analyzed several months of daily diet and blood glucose data, and what he found out that helped him keep his diabetes in check.

For more information on diabetes, a great resource is the American Diabetes Association web site: http://www.diabetes.org

Diabetes image used under Creative Commons Attribution ShareAlike 3.0 license.

The Care and Feeding of Capital Equipment (with Reliability Statistics)


In industry, it may be presses and CNC machines. In education, it may be computers and microscopes. In retail, it may be the forklifts in the warehouse. 

Reliability analysis can help detect when to replace equipment - even on a submarine

Regardless of your industry or the size of your business, in all likelihood you have one or more pieces of capital equipment that must be regularly maintained. Typically these expensive pieces of equipment come with routine maintenance plans. But what happens when, perhaps after a few years of running smoothly, the equipment starts showing signs of wear? Perhaps it starts with just one small fix and all is well. Then it requires a few more unscheduled repairs in succession. Suddenly you are staring at the possibility of having to replace the entire system…and replacing the equipment was definitely not in this year's budget!

Before this happens to you, consider using Minitab to help track these unscheduled maintenance repairs and estimate when you have to consider replacing the system. This type of analysis can be found in Minitab’s Reliability tools, using the Repairable System Analysis menu item (Stat > Reliability/Survival > Repairable System Analysis).

To demonstrate how we can monitor unscheduled repairs on a system, let's look at an example from Minitab’s Reliability training course. Sailors on the U.S. Navy submarine USS Grampus perform routine maintenance on its four diesel-electric engines. They want to monitor the unscheduled maintenance on the engines over time and determine whether the repair costs are increasing over time. To do this for one specific engine, the sailors record the time of failure, in hours, for each of 56 failures, and include the labor and part costs for each repair.

In the Event Plot graph of the Unscheduled Repairs below, we can see that the data points appear to be randomly scattered along the event plot line. This pattern indicates that the failure rate for the engine is constant over time.  The engine is not significantly deteriorating yet.

Event Plot

Even though the USS Grampus’ engine is not yet in full wear-out failure mode, when we evaluate the Mean Cumulative Function graph with cost, we can see that repair costs appear to increase rapidly at around 14,000 hours. The repairs for this engine are becoming more and more expensive.

MCF Plot

Finally, we can apply a parametric growth curve to the unscheduled maintenance time data. Because the failure rate is constant, we use the homogeneous Poisson process to get an estimate of the Mean Time Between Failures (MTBF). In this case, the MTBF for the USS Grampus' engine is 285.714 hours. This indicates that, on average, unscheduled maintenance is required approximately every 286 hours. If, in the future, we see that interval decrease, we can assume that the engine is entering its wear-out failure mode and consider budgeting for its replacement.

MTBF
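As a rough check on that number, here's how the MTBF estimate falls out of a homogeneous Poisson process in a few lines of Python. The 16,000-hour total below is an assumption inferred from the reported MTBF and failure count, not a figure stated in the course data.

```python
# Under a homogeneous Poisson process, the MTBF estimate is simply the total
# observation time divided by the number of failures. The 16,000-hour total
# is an assumption inferred from the reported MTBF and failure count.
total_hours = 16_000
n_failures = 56

mtbf = total_hours / n_failures
print(f"Estimated MTBF: {mtbf:.3f} hours")   # roughly 285.7 hours between repairs
```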

Just as the engineers on the USS Grampus are able to estimate the time between each unscheduled maintenance event and identify rising costs associated with those repairs, you too can use Minitab Statistical Software to monitor repair work and prepare for the inevitable replacement of your business’s capital equipment.  No crystal ball needed…just observational data and Minitab!

Already have Minitab but want to learn more about Reliability statistics?  Consider attending one of Minitab’s upcoming Reliability public training sessions. Join us!

Statistics Is an Art

Statistics is an art.

When you say the word "statistics" to people, most think about math. But not me. I think about art.
 
I was reminded about this when I received the following e-mail recently: 
Hi Eston --

I read your paper on "Weather Forecasts: Just How Reliable Are They?" and have a few questions. I am a retired engineer who enjoys data and would like to do a similar study for my home town. I want to look at the NWS and our local TV stations' 7 day forecasts (high, lows and precip.).

I was wondering if you have more details on the statistical analysis. I did take statistics for engineers in college in the 1960s, but have forgotten almost all I learned. I also did some statistics at work, but we had statisticians to do the hard stuff.

I enjoyed your blog "How I Learned to Love Statistics." I noticed you said "I gravitate to words, not numbers. In school I was the kid completely unfazed by William Faulkner and James Joyce." All our statisticians were even more nerds than us engineers and never entertained reading literature. When you said "I've overcome fear of statistics and acquired a real passion for it. And if I can learn to understand and apply statistics, so can you," it reminded me of understanding how to calculate the numbers, but not really knowing how or when to apply a certain test. I do remember my teacher saying that she wanted us to understand what the numbers meant and to hire a statistician to set up tests and provide guidance. I believed her.

Barry

A lot of people have asked about this article, I think because it's fun to take real-life data about things like the weather and use it to better understand how things work. Barry's e-mail gave me an opportunity to go back and re-read this article, which I co-wrote with my colleague Michelle Paret last year.

Most of the analyses we did in the weather article are very straightforward in Minitab: you just go to the menu item, enter the appropriate columns of data into a dialog box, and press "OK."
 
For instance, to create the time series plot of actual vs. forecasted temperatures shown in the article, we just selected Graph > Time Series Plot in Minitab. That brings up this dialog box, which lets us choose what type of Time Series Analysis we want to show:
 
Time Series Analysis Selection
 
Since we were looking at how four different measurements compared, we wanted the "Multiple" option. That brings up a dialog box in which we just select the four columns that contain our measurements and set the appropriate time scale. 
 
Time Series Analysis with Multiple Variables
 
When you hit "OK," Minitab performs the analysis and returns the graph shown in the article:
 
Time Series Plot
 
Behind the scenes, Minitab does all the mathematical heavy lifting, so there are no formulas to enter or anything of that sort. Minitab let us focus on the results of the analysis, rather than the calculations.

Barry's note made me think about my personal relationship with statistics and data analysis, and how statistical software has facilitated that relationship. My studies have emphasized applied statistics, and while I've learned the theoretical and mathematical underpinnings of the analysis, that's not where my interests lie. That's why statistical software is such a boon. Again, it does the mathematical heavy lifting so we can focus on what the analysis actually means.

Just as a composer might hear birdsongs and use a palette of timbres, tones and tempos to create a work that casts a new reflection on what she heard, a statistician needs to be able to look at the data available and use the palette of tests, transformations and graphs to extract illumination from those numbers. Moreover, a statistician needs to be able to understand not just what can be done with the data already in hand, but also what data could be collected. 
 
Statistics confounded me for years because I kept getting hung up on the math. What finally got me over that hurdle was realizing that while statistics is certainly a branch of mathematics, it's also an art. The calculations need to be done, but you can use different tools to do them. And like anything else, some tools are better than others. You could do an analysis with just paper and pencil, but that would be like driving cross-country in a soapbox racer. In contrast, using software like Minitab to perform the analysis is more like flying cross-country in first-class: fast and relatively painless.

In other words, once I realized better tools were available to me, the art involved in statistics became much more apparent, and I became much more interested.
 
There's nothing that can't be analyzed with statistics; the art is in finding the metrics that let you do it in a meaningful way.

Barry's comment about learning the calculation without understanding when and why you'd use it really resonates with me. The fact is, unless you're analyzing data every day, you will forget what those formulas are and which tests do what. That's one major reason we added the Assistant to Minitab 16, to provide an interactive tool that guides people to and through the right type of statistical analysis, and then helps them interpret what the analysis means.

If you'd like to explore the art of statistics, and you don't already have it, you can try Minitab 16 free. It's the complete, full package, and will work for 30 days. If you do try it, please let us know what you think. We love to get feedback from the people who use our software. 
 

 

Managing Diabetes with Six Sigma and Statistics, Part II


In my last post, I discussed how quality professional Bill Howell chose to manage his diabetes diagnosis by treating it as a Six Sigma project. Since a key metric in controlling his disease was keeping his blood glucose levels below 125 mg/dL, he tested his blood using a meter three times per day and then charted his data to see how his levels changed over time. He ensured his measurement systems were producing reliable data, and assessed the potential causes for his increased blood glucose levels. He then chose to focus on the causes he could control and analyze himself.

Tracking Daily Calories

One of these causes was his unhealthy diet. To minimize his symptoms, Howell's doctor recommended following a daily 1,800-calorie diet, which included 50 grams of fat and 200 grams of carbohydrates. Using bar charts with reference lines showing daily limits, he tracked each day's total calories, fats, and carbohydrates. The charts helped him keep his diet in check, and showed him where making diet changes might help him meet other project goals, such as keeping his cholesterol down.

Analyzing the Data

After recording and graphing several months of daily blood glucose levels and diet information, Howell analyzed his data to identify sources of variation. To determine if his three daily blood tests produced the same average level, he ran an ANOVA (Stat> ANOVA> One-Way) in Minitab.

The results revealed that the evening blood sample average, taken before dinner, was statistically lower than the morning and night average readings. The analysis also suggested that the evening reading was more uniform, because it had a lower standard deviation than the other times of day.

Howell also wanted to identify how process inputs (calories, fats, carbohydrates, and pills consumed) affected his process output (blood glucose levels). A four-panel scatterplot (Graph> Scatterplot) revealed a clear relationship between the number of blood glucose-lowering pills consumed and blood glucose levels. The plot shows that it took about 30 pills for Howell to reach target levels of around 100 mg/dL.


Relying on Control Charts

To identify gaps between current performance and goal performance, Howell used Control Charts (Stat > Control Charts) to graph his diet, pill intake, and glucose levels in relation to predetermined upper and lower bounds. If his data fell outside of the bounds, Howell knew that his process changed, and he could adjust accordingly.

The Individuals Control Chart (Stat > Control Charts> Variables Charts for Individuals> Individuals) below shows the total calories Howell consumed in a two-month span. The chart reveals a stable process and shows that he met his caloric intake requirements the majority of the time, with the exception of one data point falling above the upper control limit (UCL). On this day, Howell consumed more calories than his target caloric intake of 1,800 calories, so he ate fewer calories the following day.
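If you want to see the arithmetic behind an individuals chart, here's a short Python sketch that computes the center line and control limits from the average moving range; the calorie values are invented for illustration.

```python
import numpy as np

# Invented daily calorie totals; only for illustrating the chart arithmetic.
calories = np.array([1750, 1780, 1765, 1790, 1770, 1800, 1760, 2150, 1775, 1785])

center = calories.mean()
moving_ranges = np.abs(np.diff(calories))     # ranges between consecutive days
mr_bar = moving_ranges.mean()

ucl = center + 2.66 * mr_bar                  # 2.66 = 3 / d2, with d2 = 1.128
lcl = center - 2.66 * mr_bar

print(f"Center = {center:.0f}, LCL = {lcl:.0f}, UCL = {ucl:.0f}")
print("Days outside the limits:", np.where((calories > ucl) | (calories < lcl))[0])
```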

Howell also used Xbar-R Charts (Stat > Control Charts > Variables Charts for Subgroups > Xbar-R) to evaluate the spread between the three daily blood glucose test results (lower chart) and the average of test results for each day (upper chart). Both charts show medication level (either 1 or 2 pills per day) in relation to time.

Under Control?

Howell’s approach to managing his disease was very thorough, but how well did he meet his key performance objectives? By tracking his dietary intake and following his doctor’s prescribed diet, exercise, medication, and blood testing plan, he brought his daily blood glucose level down to the 125 mg/dL target level. Just two months after starting the project, his long-term process level average in December was several points below the target, at 116.3 mg/dL.  

To make sure his blood test results fell within the predefined specification limits (70 mg/dL – 150 mg/dL), Howell used Minitab's Process Capability Analysis (Stat > Quality Tools > Capability Analysis). He found that 97.85% of his test results met the criteria for success.

Process Capability Blood Glucose Reading
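Here's a rough sketch of the kind of calculation behind a capability analysis, written in Python with invented readings; it computes a Ppk-style index and the expected percentage within the 70–150 mg/dL specs under a normality assumption, which is only an approximation of what Minitab reports.

```python
import numpy as np
from scipy.stats import norm

# Invented blood glucose readings (mg/dL); the spec limits come from the post.
readings = np.array([118, 124, 131, 109, 142, 126, 115, 133, 121, 128])
lsl, usl = 70, 150

mean, sd = readings.mean(), readings.std(ddof=1)

# Ppk-style index based on overall variation
ppk = min(usl - mean, mean - lsl) / (3 * sd)

# Expected percentage of readings inside the specs, assuming normality
pct_within = (norm.cdf(usl, mean, sd) - norm.cdf(lsl, mean, sd)) * 100

print(f"Ppk = {ppk:.2f}, expected % within spec = {pct_within:.2f}%")
```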

By continuing to follow his process, Howell eventually weaned himself completely from the medication he initially took to lower blood glucose levels. By August 2010, he was able to maintain stable blood glucose levels without any pills. He attributes this change to controlling his diet and recording and charting everything he ate.

Howell says he is healthier than he’s been in years! He’s dropped nearly 45 pounds and has seen an almost complete reduction in all of his symptoms, including dry mouth, blurry vision and inability to sleep.

Have you ever used Six Sigma methods to manage a health problem?

For further reading, check out the full article on Minitab.com: http://www.minitab.com/company/news/news-basic.aspx?id=11292

Howell’s complete strategy for managing diabetes is detailed in his book, I Took Control: Effective Actions for a Diabetes Diagnosis.

Attending the Preliminary Round Judging of the ITEA


ITEA award

We got the opportunity a few weeks back to sit in on one of the preliminary round judging events that make up ASQ's International Team Excellence Award Process (ITEA). I wrote about the ITEA in a previous post, but the process is an annual event that celebrates the accomplishments of quality improvement teams from a broad spectrum of industries.

Eston and I traveled to Charlotte, N.C., to attend the first of seven rounds of preliminary judging that take place each year throughout October and early November. Judges working in the quality improvement field from around the world are selected by ASQ and then convene in one of the cities where preliminary round judging is held. The judges meet for two days. The first day consists of training for judges, and the second day is the actual judging of team presentations.

The ITEA Judging Process

Quality teams from all industries and a wide range of companies submit their presentations for judging in early September, but don’t actually deliver their presentation live unless they are selected as team finalists for the final round of judging at ASQ’s annual World Conference on Quality and Improvement.

The 2011-2012 team finalists saved millions of dollars for their companies and showcased exceptional projects and processes (and many were Minitab customers)!

In Charlotte, the training for judges was conducted by two quality professionals with long-time involvement in the ITEA:  Patti Trapp, current ITEA chair and senior systems applications engineer at Mercury Marine – Brunswick Corporation, and Barry Bickley, director, manager data support for Bank of America. Geetha Balagopal, who has been the ITEA’s Administrator since 1996, was also in attendance at Charlotte.

The first day of the preliminary round is all about preparing the judges for scoring the teams the next day. The scoring is based upon how well individual team presentations address the 37 predefined criteria established by ASQ.

The ITEA criteria, as well as the process for judging, have been developed and refined for more than 25 years by stakeholder groups ranging from quality improvement team members to professional project coaches and managers. The criteria exhibit current best practices for quality improvement teams, with five sections that cover the entire cycle of an improvement project.

Reaching Consensus

Judges must reach consensus among their panel, and this is where things got interesting! It was fun to watch the judges express opinions on why they thought a team did or did not meet a specific criterion. This process seemed to teach everyone involved (me included!) a lot about current best practices in quality improvement and data analysis.

“It’s good for the judges to disagree,” says Bickley. “Divergent opinions yield the best results for teams.”

After scoring, judges create detailed feedback reports for the teams, which help the teams enhance future ITEA presentations, as well as improve their respective quality improvement journeys.

The team presentations give great insight into what kinds of quality tools and methodologies teams are using to achieve success for their companies, making for a meaningful learning experience for all the teams.

“The ITEA process provides an outstanding opportunity for teams from diverse organizations around the world to come together and share their knowledge using a variety of quality processes,” says Trapp.

It’s a particularly exciting time for the ITEA because this year's process attracted a record number of team entries.

“It’s been great to watch the process grow over the years,” says Balagopal. "The dedication and support of our volunteer committee members, trainers, site coordinators and judges has been instrumental to the growth and success of the ITEA process."

At Minitab, we are thrilled to be sponsoring this year’s ITEA, and want to extend our thanks to Patti, Barry, and Geetha for inviting us to Charlotte. Good luck to all the teams who have entered!

The ITEA mission is "To support ASQ in making the ITEA Process THE world recognized program for benchmarking excellence in teams." In this endeavor, we want to continue enhancing our primarily Volunteer organization with talented, enthusiastic professionals with experience in the breadth of quality methodologies.

There are a variety of ways to become involved in the ITEA process, from judging to helping out in any one of the subcommittees that focus on the overall ITEA process, training, or criteria management. For more information, please visit http://wcqi.asq.org/team-competition/index.html or contact Geetha Balagopal at gbalagopal@asq.org.

Minitab’s Lean Thanksgiving


Use Quality Companion for process mapping (or for drawing a turkey!).

Do you celebrate Thanksgiving or other major holidays at work? Maybe you have a potluck lunch a few days before the actual event, with your department or team?

At Minitab Inc. we celebrate Turkey Day with our annual “Minitab Thanksgiving” potluck lunch. But this just isn’t any old potluck—it’s like a potluck lunch on steroids! Employees volunteer weeks in advance to plan the event, which needs to feed more than 200 Minitab employees during the 12-1 p.m. lunch hour.

The turkey, gravy, mashed potatoes, and stuffing are catered, but all of the side dishes and desserts are cooked and brought in by Minitab employees. The event always yields an impressive spread.

What’s not so impressive is the time employees have had to wait for their chance to hit the buffet. However, this year’s event had a flawless execution. Thanks to a combination of scrum and quality improvement techniques, the time hungry employees spent in line was greatly diminished compared to past years!

Planning the Feast

Using scrum, an agile software development method for managing software projects, the Minitab “pilgrims” set up a Thanksgiving setup task board to lay out all the tasks that needed to be done, as well as their statuses.

The scrum board for the Thanksgiving lunch setup.

Major tasks included setting up tables for employees to dine at, as well as serving tables for appetizers, sides, drinks, and desserts. Minor tasks included ensuring that there were adequate numbers of plastic utensils and salt/pepper shakers at each table, creating and hanging signs to direct employees to line up in the correct area, and making sure desserts were cut and had utensils for serving.

The setup of the food tables, buffets, and line formations was laid out prior to the big day in a Quality Companion process map:

Quality Companion Process Map

Mapping the Value Stream

The seeds of this year’s success were rooted in earlier challenges. Thanksgiving organizers created a current-state Value Stream Map (VSM) in Quality Companion after last year’s lunch to show them where they needed to focus their improvement efforts. Value stream maps are a key tool in many Lean projects because they illustrate the flow of materials through a process, and make it easy to identify waste.

Quality Companion Value Stream Map

The current-state VSM for the food line setup shows a high lead time, which was attributed to having only one main buffet line for the turkey and fixin’s. With this in mind, they created a future-state VSM that added two additional buffet lines:

Quality Companion Value Stream Map

Thanksgiving organizers tested out the future-state VSM with this year’s lunch. With the two additional lines added, the lead time was cut almost in half!

The Process in Action

The main line feeds into the kitchen door, and splits off into three buffet lines for turkey and all the fixin’s. Pictured below is the first line. On the other side is another buffet, with lines on each side.

turkey buffet line

The three main buffet lines feed off into two lines that move into the dining area, where a table is set up with the Thanksgiving side dishes:

side dish buffet line

Everyone is then ready to take their seats in the main dining area. Here’s a look at the table setup:

table setup for dining area

Planning for Next Year

Even though this year’s event left employees giving thanks for “the shortest time we’ve ever spent in line,” the event organizers are striving to keep quality improvement continuous. They are already starting to brainstorm improvements for next year’s feast!

If you’d like to explore the process improvement tools in Quality Companion, give it a try free for 30 days by downloading the trial version at http://www.minitab.com/products/quality-companion/free-trial.aspx. Let us know what you think!

Have you used lean or quality improvement tactics to optimize processes in your kitchen?

Happy Thanksgiving from all of us here at Minitab!


My Work with Minitab


The Minitab Fan section of the Minitab blog is your chance to share with our readers! We always love to hear how you are using Minitab products for quality improvement projects, Lean Six Sigma initiatives, research and data analysis, and more. If our software has helped you, please share your Minitab story, too!

Letter

Throughout my 15 years as a Six Sigma Initiative Leader, Consultant, Trainer, Black Belt, and Master Black Belt, I have been enthusiastic about Minitab Statistical Software, from release 11 to the present.

Minitab's graphics are outstanding in their ability to present messages and conclusions with visual clarity. They are easy to use and produce excellent PowerPoint images that explain and help leaders understand statistical results and conclusions. As a consultant and an employee, I often received feedback from leaders that Minitab charts and graphs are aesthetically pleasing, and that they can convey results at either a plant leadership or an executive level.

The intuitive interface assists the user in creating control charts, graphical summaries, and capability plots so that everybody can understand visually how well or poorly a process is performing. This graphical capability makes Minitab an amazing tool for improving processes.

As Polaroid Corporation’s quality strategy manager for product development and worldwide manufacturing, I ran 7 waves of Black Belt training and one Master Black Belt class using Minitab 11, first using individual copies for Black Belts and then under a corporate license. In my career I also was a senior consultant, Master Black Belt, project manager, and Design for Six Sigma (DFSS) Voice of the Customer specialist from Sept 1999 – Sept 2008.

I then joined  Celerant Consulting as a technical consultant from Oct 2008 – May 2011, where my work using Minitab with a team at Life Technologies solved a ten-year problem. I worked at Life Technologies from June 2011 through July 2012, mentoring a number of individuals on ANOVA, multiple regression models, Gage R&R (including expanded Gage R&R), Gage Linearity and Bias, Type 1 Gage studies, Variables and Attribute Control Charts, Capability Analysis, etc.

Those involved with Minitab's development have done an excellent job anticipating customer needs! Of particular interest to users and leaders is the reduction in work made possible by Minitab's work-saving features, such as graphs that update automatically with changes to data, or automating routine analyses with a macro.

In addition, the depth of Minitab's technical support is outstanding. Not only do they really know Minitab, they know statistics. Minitab has removed the statistical computing barrier so that non-statisticians can concentrate on applying statistical methods.

Minitab's trainers like Paul Sheehy also deserve praise. Typical of the comments I've heard is this one from my colleague Sean Paige, a senior manager for manufacturing (and I quote):

I thought the first classes were great. In fact, when I completed the evaluation I could not make a single suggestion to improve the class. Well managed, great content, great instructor, tailored content.

I wanted to take this opportunity to thank Minitab for all of the help and technical guidance that I have experienced in working with Minitab from releases 11 through 16. I have genuinely enjoyed my relationship with Minitab.

Joe Kasabula
Consultant, Business Excellence
Bedford/Framingham Manufacturing
 

Share your Minitab story!

Beyond the "Regular Guy" Control Charts: An Ode to the EWMA Chart


It's no secret that in the world of control charts, I- and Xbar- are pretty much the popular kids in school. But have you ever met their cousin EWMA? That's him in the middle of the class, wearing the clothes that look nice but aren't very flashy. You know, when Xbar- and I- were leading the championship football team last month, EWMA won the state tennis championship? I didn't go either -- pretty much only the players' parents go to tennis matches -- but I heard that he won it. Someone told me he even got a scholarship to an Ivy League school to play, not that he needed it with his grades and great SAT score.

Me? I'm going to the state university. I-, Xbar-, and I are renting a house and we're going to sublet the basement to MR- and R-.

You know, I just realized that if EWMA would just meet more people, they'd probably realize he was every bit as capable as, and maybe even smarter than, I- and Xbar-!

The "Regular Guys" of Control Charts

Let's call the I-Chart and Xbar-Chart "The Regular Guys."  You probably use The Regular Guys almost every time you make a control chart, without much thought as to what they are built to detect. The Regular Guys are really good charts and great for many situations: a sudden shift in the process or an errant point is likely to get flagged as a special cause fairly quickly.

But for a critical process, or one with mediocre quality to begin with, even a subtle shift in the process can have serious implications. That's where our friend the EWMA Chart comes in.

Suppose we want to track annual inflation, a good example of something that we want to capture subtle shifts in reasonably fast. First, let's plot the data (since 1983 when we last ended a "special cause" period) on an I-Chart:

I Chart of Inflation

We nearly get a signal for an unusually high point around 1990, and we get a signal for an unusually low point in 2009.  You may be looking at the chart and thinking, "It looks like things were maybe a little high during the 1980's and a little low during the 1990's, but maybe that's just random variation".

Enter the EWMA Control Chart

EWMA stands for "Exponentially Weighted Moving Averages." What does that mean, exactly? Well, the EWMA Chart uses each data point and all prior points to form the plot (so point 4 on the plot uses information from points 1-4, and point 19 uses information from points 1-19), and gives the most recent point the strongest weight. 
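For the curious, that weighting works out to a simple recursion, z(t) = λ·x(t) + (1 − λ)·z(t − 1). Here's a minimal Python sketch with invented inflation figures and a commonly used weight of 0.2; it computes the plotted EWMA values, not the full chart with control limits.

```python
import numpy as np

def ewma(x, lam=0.2, start=None):
    """Exponentially weighted moving average; `lam` is the weight on the newest point."""
    x = np.asarray(x, dtype=float)
    z = np.empty(len(x))
    z_prev = x.mean() if start is None else start   # a common choice: start at the mean
    for i, value in enumerate(x):
        z_prev = lam * value + (1 - lam) * z_prev   # z_t = lam*x_t + (1 - lam)*z_{t-1}
        z[i] = z_prev
    return z

# Invented annual inflation rates (%), purely for illustration.
inflation = [3.9, 3.8, 4.1, 4.4, 4.6, 4.2, 3.0, 2.6, 2.8, 2.3]
print(ewma(inflation).round(2))
```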

You can create EWMA control charts in Minitab by going to Stat > Control Charts > Time-Weighted Charts > EWMA...

Without getting caught up in all of the mathematical details, the result is that subtle but sustained shifts are made much clearer, as you can see in this EWMA Chart of the same data:

EWMA Chart of Inflation

Now it is much easier to see that we did in fact have a sustained period above the mean during the 1980's, as evidenced by the steadily increasing plotted values, followed by a period below the mean during the 1990's and stability during the 2000's...until obvious economic factors shook things up more recently.

Where Could You Use the EWMA Control Chart?

The same concept could apply to numerous metrics you're tracking at your organization. Although a signal on an I-Chart may let you know of a sudden, extreme shift or event, the EWMA will likely reveal if the process has drifted slightly from being centered (assuming it was centered to begin with) and product quality has deteriorated.  Think of where you might apply this tool on a chart you are already using, to learn more from your process!

Using Statistics for Process Improvement Outside of Manufacturing


HealthcareIt’s common to think that process improvement initiatives are meant to cater only to manufacturing processes, simply because manufacturing is where Lean and Six Sigma began. However, many other industry segments, such as healthcare and banking, also rely on data analysis and Lean Six Sigma tools to improve processes (even if those processes are more service-based).

For example, it’s increasingly common for healthcare professionals to conduct projects to help them investigate and understand certain clinical outcomes for patients, such as the incidence of a certain disease developing after surgery. And in the financial industry, there’s an increased focus on improving internal procedures, such as the processing of customer payments.

Here are a couple of interesting uses of Minitab Statistical Software outside of manufacturing that I’ve come across in recent conversations with customers.

Preventing Pressure Ulcers

Pareto chart

Pressure ulcers, or bed sores, can be inflicted upon hospital patients who must stay in bed for a long period of recovery time. The ulcers can cause further injury to patients, as well as impact the length of stay and cost of care.

As part of an initiative to eliminate preventable harm to its patients, one large hospital system planned an improvement effort to lessen the incidence of hospital-acquired pressure ulcers.

To identify which hospitals in the system should be the focus for improvements, practitioners created a Pareto chart in Minitab to show the frequency of pressure ulcer occurrence across the system. They were able to conclude that a large percentage of all pressure ulcers were occurring in four of the ten hospitals.

To drill down even deeper, they followed up with another Pareto chart to help identify which units within the hospitals had the highest incidence of pressure ulcers. The Pareto Charts helped the hospital system understand their situation, which made it easy for them to develop a targeted solution and implement it.
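A Pareto chart is easy to mock up outside Minitab, too; here's a quick Python/matplotlib sketch with invented hospital counts, just to show the bars-plus-cumulative-line structure of the chart.

```python
import matplotlib.pyplot as plt

# Invented pressure-ulcer counts by hospital, sorted from largest to smallest.
counts = {"Hospital D": 58, "Hospital A": 41, "Hospital G": 33,
          "Hospital B": 27, "Others": 22}

labels = list(counts)
values = list(counts.values())
cum_pct = [100 * sum(values[:i + 1]) / sum(values) for i in range(len(values))]

fig, ax1 = plt.subplots()
ax1.bar(labels, values)                    # frequency bars
ax2 = ax1.twinx()
ax2.plot(labels, cum_pct, marker="o")      # cumulative percentage line
ax2.set_ylim(0, 110)
ax1.set_ylabel("Pressure ulcer count")
ax2.set_ylabel("Cumulative %")
ax1.set_title("Pareto chart of pressure ulcers by hospital")
plt.show()
```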

Understanding Volume of Customer Payments

Occasionally, large business customers who borrow money from major financial institutions become concerned that their payments are not being processed on the day they are received. These payments carry much larger dollar values than a typical monthly home mortgage payment and are made much more frequently, so the customers want to be sure their payments are being processed in a timely fashion and their accounts are being updated correctly.

Control Charts

When this occurs, the institution can show the customer graphically their historical volumes of payments over time. With control charts in Minitab, the institution can illustrate the customer’s influx of payments processed for each day of the week over the course of a year. The charts can show them what “normal” really is for their volume of payments – and they are often amazed by the variation!

Learning from Others

Learning about the use of quality improvement tactics across different industries can introduce you to new tools or methods that also might work well for your business. For example, I was recently at a conference where a session described how one hospital identified gaps in patient safety using the same best practices employed by federal and international aviation administrations to keep passengers safe.

To read how Minitab customers from across a multitude of industries have solved problems and improved quality, check out our Case Studies and Testimonials page.

Do you work in an industry outside of manufacturing that also uses process improvement techniques?

Making My Statistical Software Match My Needs


When I want to analyze data, I want my statistical software to give me the options I want, when I want them. I want a menu that's perfectly suited to my needs. Maybe even a toolbar that gives me instant access to the tools I use most frequently. 

That's not too much to ask, is it? 

Look, you can't argue with nature. I'm a cat, which means I want things my way. If my human puts something (like a computer keyboard) right where I planned to take my mid-early-afternoon nap, I'm going to lie down there anyway and make him type around me. As the photo below and a bunch of others on this webpage prove, you cats out there know just what I'm talking about:

 I'm not moving until you customize my statistical software!

Fortunately, the statistical software my human uses is very easy to customize, so I can access the analyses I need exactly when I need them. That makes me purr.

A Set of Statistical Tools Made Just for Me (or You)

customize statistical software menu

One of the things my human is always going on about is Six Sigma and quality improvement, and how we can use data to improve ourselves and the things we do. As a cat, I don't have much room for improvement myself...but I do like to make sure my human is performing his vital functions (like filling my food dish) at peak performance.

So I made myself a menu in Minitab that includes the tools I use most frequently. It was shockingly easy. I just right-clicked on the menu bar and selected the "Customize" option. 

That brought up the dialog box shown below. Selecting the "New Menu" command and dragging it from the "Commands" window to the menu bar created my menu.

From there, I was able to rename my new menu "Marlowe." I then began dragging and dropping the tools I use most frequently, including contour plots, time series plots, and the fishbone: 

customized statistics menu

Cool, huh?  As you can see from the self-portrait I created to symbolize the Two Distributions... tool, you can even customize your icons. 

And that's just the beginning. There's a lot more you can do to customize Minitab to suit your needs, including creating custom toolbars and even creating individual profiles if, like me, you have to share your computer with humans. 

I'll talk about how different user profiles can help you co-exist with a human in my next post.

Two-Tip Tuesday: Using Minitab Output in Presentations and Reports


After you’ve done the work to analyze your data in statistical software such as Minitab, it’s likely that you’ll want to organize your output and results into a presentation for colleagues or clients.

You want your presentation to look great, so check out these tips below for ensuring your graphs look as good in your presentations as they look in the software:

1. Sizing Minitab Graphs for Presentations

Minitab graphs can be too large when you import them into a Word file or PowerPoint presentation. While you can click and drag a graph image to make it fit the layout, this method can distort the image. Here’s how to size the graph without distortion:

  • Set the appropriate size in Minitab before you copy the graph. Minitab automatically redraws the elements of the graph (such as the title, symbols, etc.), so that they don’t become distorted. Just double-click the graph region (in Minitab) and click the Graph Size tab to change the dimensions:

Dialogue Box

  • When you create a report in PowerPoint, use its Best scale for slide show option to optimize the size of a graph at a given screen resolution. Right-click the item (within PowerPoint) and choose Size and Position, and then check Best scale for slide show.
2. Maintaining Image Clarity

In Minitab, the default graph displays are always clear. But, when you shrink and resize graphs to fit them into your report, they may not look as crisp. Here are two solutions to maintain clarity:

  • Resize critical elements in Minitab before copying. Enlarging the font size of detailed tables and labels can keep them legible in situations where a graph must be dramatically reduced in size.

Minitab Graphs

The text in the original graph (left) was enlarged beyond its default size (right) within Minitab before copying over to Word or PowerPoint.

  • Use a file format that maximizes clarity. Though resizing by clicking and dragging within your report isn’t recommended, you can minimize its negative effects with one of the following file formats when you paste your graph using Edit > Paste Special: the Enhanced Metafile to maintain the clearest graph text or the Bitmap format to retain the crispness of graph points and symbols.
Creating Presentations within Quality Companion

Bonus tip: Did you know that you can create a presentation directly within the process improvement software Quality Companion? Rather than saving your presentation as a separate file and working in an outside application, use the presentation feature to keep this important element of your project saved within your Quality Companion project file.

Your presentation can include images of tools, forms, and graphs you create within Companion, as well as any inside or outside content that you can copy and paste (for example, parts of a process map or content from other software applications). After you create a presentation, you can view it in Companion or easily export it to PowerPoint (Actions > Export to PowerPoint). You can also save the presentation as a webpage.

Presentation in Quality Companion

The above presentation was created in Companion, and is stored in the Project Roadmap you see on the left. The presentation can be viewed and edited from the Roadmap, and you can easily toggle back-and-forth from the presentation to other tools you’ve used for other project phases.

To learn more, read the tutorials in Using Minitab Output in Presentations and Reports, or see section 6 of Quality Companion 3 – Getting Started for details on creating presentations within Quality Companion.

What the Heck Is Best Subsets Regression, and Why Would I Want It?

Last time, we used stepwise regression to come up with models for the gummi bear data. Stepwise regression is a great tool, but it has a downside: when we use stepwise selection in design of experiments, especially if we focus on only the last step, we can miss interesting models that might be useful.

One way to look at more models is to use Minitab’s Best Subsets feature. Instead of identifying a single model based on statistical significance, Minitab’s Best Subsets feature shows a number of different models, as well as some statistics to help us compare those models.

To get the idea, let’s look at some smaller models.
 

The best 4-predictor model has an r-squared of 80.2, an adjusted r-squared of 79, a Mallows' Cp of 50.1, and an S of 10.131.

The variables column (Vars) shows how many terms are in the model. In this case, I requested that Minitab print two models with 1–4 terms. Here are the statistics that Minitab provides to help you choose a model:

  • R2 is for when you compare models with the same number of terms. Higher is better.
  • Adjusted R2 is for when you compare models with different numbers of terms. Higher is better.
  • Mallows’ Cp is for when you compare models with the same or different numbers of terms, provided all of the models came from the same initial set of terms. If you change the predictors and run best subsets a second time, you cannot use Mallows’ Cp to compare the models. The smaller the Mallows’ Cp, the better it is for prediction. The closer it is to the printed number of variables + 1 (for the intercept term), the less biased the estimates of the coefficients are.
  • S is an estimate of the variability about the regression line. Smaller is better.

In the output above, the model with

  • the position of the fulcrum
  • the angle of the catapult
  • the number of rubber band windings, and
  • the interaction between the position of the catapult, the position of the gummi bear, and the number of rubber band windings

has the best statistics. However, the statistics for the other 4-predictor model are so close that it would be hard to say that one is practically better than the other.

With a smaller number of models to work with, you can use Minitab to check the predicted R2 values. Predicted R2 is similar to the other R2 statistics, but it estimates how well a model predicts new observations. That’s often the most relevant criterion: we want the model to predict what will happen when we launch a new gummi bear. As it turns out, the predicted R2 values are nearly equal for both four-term models: 77.09%. Based on these statistics, there’s no reason to think that one model will outperform the other, even though their predictions can vary considerably.
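
If you want to see the mechanics behind these statistics outside of Minitab, here is a minimal best-subsets sketch in Python. This is not Minitab’s implementation (Minitab uses a more efficient search), and the data frame and column names are hypothetical stand-ins for the gummi bear factors; the sketch simply fits every subset of predictors up to a given size and reports R2, adjusted R2, Mallows’ Cp, and S for each.

```python
# A rough best-subsets sketch (not Minitab's algorithm): enumerate every subset of
# predictors up to max_terms and report the comparison statistics described above.
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm

def best_subsets(X, y, max_terms=4):
    n = len(y)
    full = sm.OLS(y, sm.add_constant(X)).fit()        # full model supplies the MSE used in Cp
    mse_full = full.mse_resid
    rows = []
    for k in range(1, max_terms + 1):
        for combo in itertools.combinations(X.columns, k):
            fit = sm.OLS(y, sm.add_constant(X[list(combo)])).fit()
            p = k + 1                                   # parameters, including the intercept
            cp = fit.ssr / mse_full - (n - 2 * p)       # Mallows' Cp
            rows.append({"terms": combo, "Vars": k,
                         "R2": fit.rsquared, "Adj_R2": fit.rsquared_adj,
                         "Cp": cp, "S": np.sqrt(fit.mse_resid)})
    return (pd.DataFrame(rows)
              .sort_values(["Vars", "Adj_R2"], ascending=[True, False]))

# Hypothetical usage with a data frame of catapult factors and a launch-distance response:
# report = best_subsets(df[["fulcrum", "angle", "windings", "position"]], df["distance"])
# print(report.groupby("Vars").head(2))   # the two best models of each size
```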

Next time, I plan to take a look at some larger models to see if we can do better on the predictions. If you’re ready for more statistics now, take a look at what Jim Colton can show you about some common misconceptions about R2.

Holiday Baking: Using DOE to Bake a Better Cookie

sugar cookie
It’s the most wonderful time of the year – the time for holiday bakers and cookie monsters to unite! So what’s a quality improvement professional to do when his favorite sugar cookie recipe produces cookies that fail to hold their festive holiday shapes after baking? Run a Design of Experiment (DOE), of course!

A Fractional Factorial Experiment

Bill Howell, an avid baker and quality professional, used Minitab’s DOE tools to get to the bottom of his sugar cookie shape faux pas.

Howell planned to design an experiment that would allow him to screen many factors, determine which were most important, then adjust his process to get the results he wanted—in this case, to make cookies that still looked like snowmen when they came out of the oven.

He elected to run a fractional factorial experiment, a class of factorial designs that lets you identify the most important factors in a process quickly and inexpensively.

Howell’s experiment required him to make 8 runs (or batches of cookies) to assess six factors, each of which was tested at two levels:

  • Oven temperature (325 or 375 F)
  • Number of eggs in a batch (1 or 2)
  • Ounces of AP Flour in a batch (9 or 13.5)
  • Baking soda amount (0.5 or 1 teaspoon)
  • Cream of Tartar amount (0.5 or 1 teaspoon)
  • Chilling the dough after rolling and cutout (yes or no)

Howell took extensive steps to ensure a robust process. He used 3 different shaped cutters to prepare the cookie dough for the oven, selecting measuring points on each cutter and measuring them with a 6-inch caliper accurate to 0.001 inch. Each of the eight experimental batches included stars, snowmen, and gingerbread men.

To ensure consistent dough thickness, Howell used wood strips to prevent his rolling pin from flattening dough any thinner than ¼ inch. To minimize undue influence or unintentional bias during the baking process, he randomized the placement of the cookies on the baking sheet. He also rotated the baking sheets 180° halfway through baking.

Because two oven temperatures were used in the experiment, baking times varied by trial. The actual cooking times for each trial were recorded on the trial instruction sheet.

Each trial consisted of baking two trays of cookies. When they came out of the oven, Howell measured two samples of each shape from both trays to see if there had been a change in overall height, a selected width measurement, or thickness. These dimensions were recorded on preprinted forms, which identified the trial number, date of trial, cutter shape, width, and height. Howell calculated averages and standard deviations for each cutter shape, and used Minitab to analyze the data.
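
For readers curious about how such a design can be laid out, here is a minimal sketch in Python. The article doesn’t say which fractional design Minitab generated for Howell, so the generators below (D = AB, E = AC, F = BC, a standard choice for a 2^(6-3) design) and the factor names are assumptions for illustration, not his actual worksheet.

```python
# Sketch: lay out an 8-run, 6-factor, two-level fractional factorial.
# The generators D=AB, E=AC, F=BC are an assumed standard 2^(6-3) choice.
import itertools
import pandas as pd

levels = {  # low/high settings taken from the factor list above
    "Temp_F":     (325, 375),
    "Eggs":       (1, 2),
    "Flour_oz":   (9, 13.5),
    "Soda_tsp":   (0.5, 1),
    "Tartar_tsp": (0.5, 1),
    "Chill":      ("no", "yes"),
}

runs = []
for a, b, c in itertools.product([-1, 1], repeat=3):          # full factorial in A, B, C
    runs.append({"A": a, "B": b, "C": c, "D": a * b, "E": a * c, "F": b * c})

design = pd.DataFrame(runs)
for coded, (factor, (lo, hi)) in zip("ABCDEF", levels.items()):
    design[factor] = [lo if v == -1 else hi for v in design[coded]]   # map codes to settings

print(design[list(levels)])   # the 8 batches to bake -- randomize the run order in practice
```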

DOE Results

An analysis of height and width measurements done in Minitab revealed that flour was the driving factor in the spread of the cookies. “In each instance, a higher amount of flour produced less spreading from the original dimension,” Howell says. “Impact on cookie thickness was principally influenced by flour and the number of eggs in the batter. Two eggs produced more rise than one egg.”

cookie DOE - snowman thickness

Howell also used Minitab to create main effects plots, which examine differences among level means for one or more factors.

cookie DOE snowman thickness

Howell’s main effects plots reinforced the findings of the analysis, and also revealed that cutter type had an effect. “The width measurement for the star-shaped cutter moved an average of .35 inches, but the Snowman moved an average .95 inches and the Gingerbread Man moved an average of .55 inch,” he says. “This indicates that the shape of the cutter affects the flow of the cookie dough as it bakes.”
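
The arithmetic behind a main effects plot is simply a difference of level means. The short sketch below illustrates that calculation on a hypothetical results data frame with one row per measured cookie and a width_change response column; it is an illustration of the idea, not Howell’s actual analysis.

```python
# Sketch: compute main effects as differences between level means, the quantity
# a main effects plot displays. 'results' and 'width_change' are hypothetical names.
import pandas as pd

def main_effects(results: pd.DataFrame, factors, response="width_change"):
    rows = []
    for f in factors:
        means = results.groupby(f)[response].mean()     # mean response at each level
        rows.append({"factor": f,
                     "low_mean": means.iloc[0],
                     "high_mean": means.iloc[-1],
                     "effect": means.iloc[-1] - means.iloc[0]})
    # Largest absolute effects first, mirroring the steepest lines on the plot
    return pd.DataFrame(rows).sort_values("effect", key=abs, ascending=False)

# Hypothetical usage:
# print(main_effects(results, ["Temp_F", "Eggs", "Flour_oz", "Soda_tsp", "Tartar_tsp", "Chill"]))
```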

Howell is confident the experiment he designed and analyzed using Minitab will result in better cookies in the holiday seasons to come. “Cutout cookie batches this holiday season will follow the methods and levels that worked best in the experiment—with maybe just another ½ oz of flour thrown in—as these held their shape nicely, and the people who sampled from this trial liked their taste.”

Bill Howell’s Optimized Sugar Cookie Recipe

Howell’s recipe can be found at http://www.minitab.com/company/news/news-basic.aspx?id=10290. If you try the recipe, be sure to let us know how your cookies turn out. Happy Baking!

Have you ever used DOE to optimize a recipe? Let us know in the comments below.

Of possible interest:

How Statistics Got to the Root of My Turnip Soup Problem

Celebrating National Pierogi Day with DOE

Gummi Bear DOE


Sharing Your Minitab Custom Settings

It’s the holiday season, a time for volunteering and gift giving…and sharing!  In that same spirit, I thought it would be a perfect time to talk about sharing in Minitab.  As you may have picked up from other blog postings (see Stats Cat’s post on Minitab customization), there are lots of ways you can customize your Minitab to save time and create great looking presentations.  You can spend hours going through all of Minitab’s options choosing just the right settings to make your graphs and statistical output pop.  It would be a shame if you couldn’t share all that hard work with your colleagues and coworkers! 

To export your hard-earned settings, open Minitab and go to Tools > Manage Profiles. To export your currently active profile, click the left arrow button to move MyProfile to the Available Profiles box.  If you wish to rename a profile before exporting, double-click it and enter a new profile name. With the profile selected, click Export and then choose a location to save it.  You now have a saved registry file (*.REG) that other users can import and activate.

Manage Profiles

So you send your .reg profile to your most favorite coworker (or your second PC) and need to import it into Minitab.  Simply go back into Tools > Manage Profiles and click Import.  Select the new profile's .reg file and click Open. To activate the profile, choose it and click the right arrow button to move it from Available Profiles to Active Profiles. All profiles in Active Profiles are active.  If you want your new profile to override existing profiles, click the up arrow to move it upward in the profile hierarchy. Click OK in the Manage Profiles box and…voilà!  You’ve successfully given the gift of Minitab custom settings!

*Note: Minitab custom settings are not to be taken lightly… making a coworker’s Minitab graphs all hot pink and mint green is not encouraged by Minitab (but it would be pretty funny!)

How I Came to Grips with Statistics in Manufacturing

The Minitab Fan section of the Minitab blog is your chance to share with our readers! We always love to hear how you are using Minitab products for quality improvement projects, Lean Six Sigma initiatives, research and data analysis, and more. If our software has helped you, please share your Minitab story, too!


Letter
Growing up, I had little or no interest in math. In Africa we had little or no manufacturing; therefore, I had no idea about manufacturing processes, quality improvement, and so on. There was nothing there to motivate me to play with numbers. I used to be terrified to see charts and numbers because I was not familiar with the statistical tools and methodologies used to achieve results.

About 12 years ago, one of my supervisors recommended that I attend Six Sigma training classes and help manufacturing with quality improvement.  It was a very scary moment for me. I was a college graduate with no math understanding, sitting with so many high-level statisticians. It was frightening.

Shortly after my Six Sigma training I was given a laptop computer and Minitab Statistical Software. Slowly but surely I began applying Minitab's tools and functions to my work. Because Minitab is simple yet powerful, I had an incentive to build on my understanding of the tool, to go further and learn more about its practical application. Today, I can humbly say that I have a good understanding of the application, its tools, and its impact on quality improvement.

Minitab gave me the tools to understand the manufacturing process better and more completely.

I would like to take this opportunity to say thank you to the visionaries and developers who created Minitab, and your partners around the world.  Thank you.

Ayele Zewdie
Six Sigma Black Belt
ACH LLC
Plymouth


Share your Minitab story!

Choosing the Right Distribution Model for Reliability Data

Recently I've been refreshing my knowledge of reliability analysis, which is the use of data to assess a product's ability to perform over time. Quality engineers typically use reliability analysis to predict the likelihood that a certain percentage of products will fail over a given amount of time.   

Statistical software will do the calculations involved in a reliability analysis, but there's a catch: first, you must choose a distribution to model your data. Put plainly, you need to tell the software to base its analysis on the normal distribution, the Weibull distribution, or perhaps some other, more exotic distribution. 
Why does choice of distribution matter for reliability analysis?
turbine
Let's say you work for a company that makes engine windings for turbines. You're concerned that if these parts are exposed to high temperatures, they will fail at an unacceptable rate. You want to know -- at given high temperatures -- the time at which 1% of the windings fail.
 
First you collect failure times for the parts at two temperatures. In the first sample, you test 50 windings exposed to 80°C; in the second sample, you test 40 items at 100°C. 
 
Naturally, you want your reliability analysis to be...um...reliable. This is where choosing the right distribution comes in. The more closely the distribution fits your data, the more likely the results of the reliability analysis will provide good information about how your product will perform. 
 
You want to use parametric distribution analysis to assess the reliability of the engine windings. But how do you know which distribution to choose for your data? 
Using statistical software to identify the distribution of reliability data
Textbooks suggest relying on practical knowledge or direct experience with product performance. You might be able to identify a good distribution for your data by answering questions such as: 
  • Do the data follow a symmetric distribution? Are they skewed left or right?
  • Is the failure rate rising or falling? Or is it staying constant?
  • What distribution has worked for this analysis in the past?
If you don't have enough knowledge or experience to confidently select a distribution, statistical software can help. In Minitab, we can evaluate the fit of our data using the Distribution ID plot, by choosing Stat > Reliability/Survival > Distribution Analysis (Right-Censoring or Arbitrary Censoring).
 
This tool lets us determine which distribution best fits the data by comparing how closely the plot points follow the best-fit lines of a probability plot. We'll choose the Right-Censoring option and fill out the dialog box as shown: 
 
Distribution Identification Plot Dialog
 
We're just going to compare our data to the Weibull, Lognormal, Exponential, and Normal distributions; however, had we wished, we could have Minitab test the fit of our data against 11 distributions by clicking the "Use all distributions" option.
 
When we click "OK," Minitab gives us a lot of output in the Session window, and also this graph: 
 
distribution identification plot
Choosing the best distribution model from the identification plot

We're looking to see which distribution line is the best match for our data. Immediately we can rule out the Exponential distribution, where barely any of our data points follow the best-fit line. The other three look better, but the points seem to fit the straight line of the lognormal plot best, so that distribution would be a good choice for running subsequent reliability analyses.

It can sometimes be difficult to tell which distribution is the best fit from the graph alone, so you should also check the Anderson-Darling goodness-of-fit values and other statistics in the Session window output. The Anderson-Darling values also appear in the plot legend. The smaller the Anderson-Darling value, the better the fit of the distribution.

For our data, the Anderson-Darling values for the lognormal distribution are lower than those for the other distributions, further supporting the lognormal distribution as the best fit.
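
If you'd like a rough feel for the same kind of comparison outside Minitab, here is a sketch using scipy. Two big caveats: the code below assumes complete (uncensored) failure times on simulated stand-in data, and it uses maximum-likelihood fits with a Kolmogorov-Smirnov check rather than Minitab's Anderson-Darling statistics adjusted for right-censoring, so treat it only as an illustration of the idea.

```python
# Sketch: fit candidate distributions to (simulated, uncensored) failure times and
# compare the fits. This is NOT Minitab's Distribution ID Plot: it ignores censoring
# and uses a Kolmogorov-Smirnov check instead of Anderson-Darling values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
failure_times = rng.lognormal(mean=4.5, sigma=0.6, size=50)   # stand-in for the 80°C sample

candidates = {
    "Weibull":     stats.weibull_min,
    "Lognormal":   stats.lognorm,
    "Exponential": stats.expon,
    "Normal":      stats.norm,
}

best = None
for name, dist in candidates.items():
    params = dist.fit(failure_times)                          # maximum-likelihood estimates
    ks = stats.kstest(failure_times, dist.cdf, args=params)
    print(f"{name:12s}  KS statistic = {ks.statistic:.3f}   p-value = {ks.pvalue:.3f}")
    if best is None or ks.statistic < best[2]:
        best = (name, params, ks.statistic)

# For the chosen distribution, the time by which 1% of windings fail (the B1 life)
# is just its 1st percentile:
name, params, _ = best
print(f"Best KS fit: {name}; estimated time at which 1% fail = {candidates[name].ppf(0.01, *params):.1f}")
```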

Have you ever needed to identify the distribution of your data?  How did you do it? 

Making Statistical Software Work for Multiple Users

The Stats Cat
Marlowe the Stats Cat here. Earlier, I showed you how easy it was to set up my statistical software with a personalized menu of statistical tools I use most often.

The problem is that I share a computer with one of the humans who live in my house, and the statistical tools I use most may not be the ones he needs to use. And I don't want to clutter my interface with a "Human" menu. 
 
I'm trying to be kind, but I should just be direct about this: as a cat, I have abilities that far outstrip those of my human. That extends to the range of statistical tools I can use effectively.
 
What I need to do is set the software up so that I have access to the full range of statistical tools available in Minitab—since I know how to use them—while limiting my human's access to only those functions that he can understand. Or at least those he won't get himself into trouble with. 
 
You cats who live with humans know exactly what I'm talking about.
 
Similar situations exist in human companies where, for instance, a Six Sigma Black Belt might want to customize the array of tools available to less experienced colleagues. Likewise, a teacher might want to limit students in the classroom to those functions pertinent to the course. 
Providing Different Statistical Toolsets for Different Skill Levels
To do this in Minitab you just create separate Profiles for different users. Profiles store all the settings you've selected in Tools > Options and Tools > Customize. They also store custom date/time formats, Autofill lists, and value order settings for text data.
 
It's easy to use Manage Profiles > Manage to make, save and manage multiple profiles of options and customization settings in Minitab. To make a new profile, just click the button circled in the picture below and start customizing! Once your custom profiles are set up, all you need to do is choose the profile you want to use—or that you want your human to use—and make it active or inactive. 
 
manage profiles
 
But what if you create a very limited custom profile, and then want to add something else to it later? Or you want to get back to how everything looked when you first installed the software? No problem. You also can use "Manage Profiles" to restore Minitab's default settings, so you never need to worry about "losing" something you've removed from a custom profile.
 
Now I can continue to use Minitab's full interface. But when my human uses Minitab, he'll access the customized profile I created for him. I'm hoping it doesn't give him too much trouble: 
 
Menu for my human
 
Depending on your situation, you could make profiles for students or colleagues that provide only the tools they need. You can even rename menus and menu items to reflect how you're using the tools. You could have custom profiles for individual team members, or for different projects. For instance, in a healthcare setting, the G chart tool for rare events could be put in a menu and relabelled "Accident Chart." 
That's a Mighty Nice Toolbar...Mind If I Use It? 
Let's say you made a profile for one project that has a toolbar that would be ideal for a new project. There's no need to recreate the toolbar: just use Manage Profiles > Toolbars to move customized toolbars between your own active and inactive profiles, or to share them with other users' profiles, without having to import or export entire profiles.
What If You Want to Use a Profile on More Than One Machine?
I only need to worry about creating a profile for my human on a single computer, but in a work or classroom setting, you'll probably want to create a profile then make it available on the computers of everyone who needs it. 
 
You also can import and export Minitab profiles, so that users of multiple machines can share the same group of settings and customizations, as one of my human's co-workers pointed out.
 
It's good to know that if my human gets another machine, I can easily put the custom Minitab profile I've created for him on that one, too.  You know, just to keep him safe. And keep my data safe from him. 
 
One way or the other, I can rest easy. And I plan to do just that, as soon as I finish this pouch full of kitty treats.  

 

Violations of the Assumptions for Linear Regression: The Trial of Lionel Loosefit (Day 1)

jury

Bailiff: All Rise. The Honorable Judge Lynn E. R. Peramutter presiding.

Judge: Please be seated. Bailiff, please read the charges.

Bailiff: Your honor, this is the case of the State vs. Lionel Loosefit. The defendant is charged with creating a model that violated the legal requirements for regression. The infractions include:

  • Producing grossly nonnormal errors
  • Producing errors that lack independence
  • Exhibiting nonconstant variance
  • Violating the linearity assumption

Judge: Thank you, bailiff. Let’s hear the opening statement by the prosecutor.

Prosecutor: Your honor, ladies and gentlemen of the jury. We’re here today to try the defendant, Mr. Loosefit, for gross statistical misconduct in performing a regression analysis. You heard the bailiff read the charges—not one, but four blatant violations of the critical assumptions for this analysis. Yet, despite being fully aware of the egregious nature of these heinous infractions, the defendant knowingly and willfully used his model, in public, to estimate response values, violating the basic tenets of statistical decency.

 [Alarmed murmurs and sudden gasps in the courtroom.]

Judge: (rapping gavel) Order!! How does the defendant plead?

Defense Lawyer: My client pleads “not guilty” to all four charges, your honor.

Judge: (to prosecutor) Counsel, proceed.

Prosecutor: Your honor, these are serious offenses. Because the statistical penal codes are a bit murky, and because our court stenographer is a part-time Minitab blogger who types using only one finger, I’d like to ask the court’s permission to address each charge one blog post at a time.

Judge: Granted.

The Prosecution's Case: Errors of a Highly Nonnormal Nature

Prosecutor: Ladies and gentlemen, today let us examine the charge that the errors in the defendant’s model lack normality. I’d like to start by calling our expert witness to the stand, Dr. Minnie Tabber, world-renowned statistician.

[Dr. Tabber approaches the stand and places right hand on a thick, leather-bound volume of the cumulative probabilities of the standard normal distribution.]

Bailiff: Do you swear to tell the statistical truth, the whole statistical truth, so help me Ronald Fisher?

Dr. Tabber: I do.

Bailiff: Please take a seat in the witness stand.

Prosecutor: Dr. Tabber, please explain your area of expertise.

Dr. Tabber: My specialty is statistics. My passion is Quality. Data. Analysis.

Prosecutor: And could you briefly explain to the court the requirement of normally distributed errors in a regression model?

Dr. Tabber: Certainly. The errors in any model are just the differences between the actual observations in your data and the expected values predicted by your model. No model is perfect. You expect errors—but you want the errors (which are also called residuals) to be reasonably normal.

Prosecutor: Your honor, I’d like to introduce Exhibit A, a normal histogram of the residuals from a regression model:

Exhibit A

Prosecutor: Could you explain what we’re seeing here, Dr. Tabber?

Dr. Tabber: Sure. This is a histogram of the errors from a model. The horizontal scale shows the difference between the data observation and the value predicted by the model. It tells you the size of the errors. Notice that most errors are at or close to 0—that’s what you want to see. It means most of the data don’t deviate much from the values predicted by the regression model. As the errors become larger in either direction, there are fewer and fewer of them, and they’re spread roughly evenly on both sides.

Prosecutor: In other words, the distribution of the errors in Exhibit A is fairly normal.

Dr. Tabber: Yes. The residuals don’t have to perfectly follow the bell curve indicated by the blue line. But basically you want the highest bar to be close to 0, and the bars on each side to be progressively smaller—showing no strong tendency of the model to overestimate or underestimate values.

Prosecutor: Your honor, I’d like to introduce Exhibit B, the histogram of the residuals from the defendant’s model.

Exhibit B

Prosecutor: Dr. Tabber, would anyone in their right mind call these errors normal?

Dr. Tabber: It would take quite a few martinis to make that look like a bell curve. That model appears to be quite inconsistent in how it overestimates and underestimates response values.

Prosecutor: Of course. Even a kindergartner can see the errors are extremely skewed to the right. Where’s the left tail? Why, it doesn’t exist!  Did somebody steal the left tail from the defendant’s residuals? Perhaps it's been swiped by a wild, merry gang of number crunchers, who are using it to play a game of pin the tail on the histogram!

[Laughter in the courtroom].

Judge: (rapping gavel) Order!!

Prosecutor: No further questions your honor.

The Cross Examination: Do Bins Make the Bells?

Judge (to defense attorney): You may cross-examine the witness.

Defense Attorney: Dr. Tabber, you mentioned that the residuals don’t have to be perfectly normal, is that correct?

Dr. Tabber: Yes, that’s correct. They should be roughly normal to satisfy the requirements for regression.

Defense Attorney: Dr. Tabber, have you ever heard the famous quotation: “The only normal people are the ones you don’t know very well”?

Dr. Tabber: That sounds vaguely familiar…

Defense Attorney: What, really, is “normal”? After all, we all make errors, don’t we? Like taking a wrong turn...getting someone’s name wrong… brushing your teeth with shaving cream… forgetting to file taxes for several years in a row...

[Judge raises an eyebrow.]

Defense Attorney: Dr. Tabber, would you consider it normal to sing to yourself while driving alone in your car?

Dr. Tabber: Personally, I don’t do that—but yes, I’d consider that normal behavior.

Defense Attorney: Of course. What about talking to yourself out loud while driving alone in your car? And I don’t mean using a phone.

Dr. Tabber: Well, uh, I’m not sure…

Defense Attorney: Of course, you’ve never done anything abnormal like that, have you? All of your errors are perfectly normal, aren’t they?

Prosecutor: Objection, your honor! This is irrelevant—the issue here is the normality of the defendant’s residuals, not the normality of a statistician.

Judge: Sustained. Counsel, keep your questions addressed to the residuals.

Defense Attorney: Your honor, I’d like to introduce Exhibit C, another histogram of residuals.

Exhibit C

Defense Attorney: Dr. Tabber, would you be satisfied that the residuals in Exhibit C are normally distributed?

Dr. Tabber: I’d have some misgivings about that.

Defense Attorney: What concerns you?

Dr. Tabber: Well, the data appear to be bimodal—with two centers. And the high frequency of errors from 26000 to 38000 is troubling.

Defense Attorney: Troubling, yes. Indeed.

[Pauses, then turns to jury box.]

Defense Attorney: But you know what’s really troubling, Dr. Tabber? These are the exact same residuals as in Exhibit A—which gave us that lovely, beautiful bell curve. The only difference is that this histogram uses different intervals to define the bins. It's the same data in different cans.

[Murmurs in the courtroom]

Defense Attorney: Suddenly, we’ve gone from “normal” to “troubling”… just by changing the number of bins on the graph. Yet the evidence (the errors themselves) hasn’t changed. Dr. Tabber, might we say then, that normality, like beauty, can be very much in the eye of the beholder?

Dr. Tabber: Well, that might be true in certain instances, but...

Defense Attorney: Thank you. No further questions, your honor.

[Spectators begin nodding to one another.]

The Reexamination: Putting Evidence to the Test

Prosecutor: Your honor, I’d like to briefly re-examine the witness.

Judge (yawning): Keep it short. We’ve all got other work to do. And I’ll remind you our stenographer is typing with his pinky finger.

Prosecutor: Dr. Tabber, are histograms always definitive indicators of normality?

Dr. Tabber: Histograms are pretty accurate with large data sets, but with small data sets the binning intervals on the graph can greatly affect its appearance, as we have just seen.

Prosecutor: So is there nothing we can do? Must we just wring our hands and assume no one can say whether data is normal or nonnormal? Do we assume that we can’t separate up from down? Or that the planet is spinning senselessly in a chaotic cosmos devoid of meaning, with all distributions being relative and arbitrary?

Dr. Tabber: Most definitely not. Minitab has other tools to assess normality, such as the probability plot.

Prosecutor: Your honor, I’d like to introduce Exhibit D, a normal probability plot of the defendant's residuals.

Exhibit D

Prosecutor: Dr. Tabber, there are lots of points, lines, and numbers here—can you translate for us?

Dr. Tabber: The blue line shows the expected percentiles for the given distribution—in this case, the normal distribution. The red points show the actual values in the data set. Basically, you want the red points to fall along the blue line—that means your data fits the given distribution.

Prosecutor: So you want those big red dots to appear like a long, skinny caterpillar on the blue branch?

Dr. Tabber: I guess you could say that…

Prosecutor: Not like a snake that’s looped around the branch, its tail dangling toward the ground, its head in the air, slithering toward its innocent prey…

[Horrified gasps in the courtroom]

Defense Attorney: Objection, your honor!!

Judge: Sustained.

Prosecutor: So in this case, does the probability plot indicate that the errors in the defendant’s model are roughly normally distributed?

Dr. Tabber: Not at all. And neither do the results  for the Anderson-Darling (AD) normality test.

Prosecutor: Where do we see those test results?

Dr. Tabber: They’re shown by the p-value in the graph legend. If the p-value is less than the alpha level of 0.05, we reject the hypothesis that the data follow the normal distribution.

Prosecutor: How reliable is this AD test?

Dr. Tabber: Well, the p-value is less than 0.005. If these errors really were normally distributed, we’d see data this far from normal less than 0.5% of the time.

Prosecutor: So there’s no doubt in your mind these data deviate significantly from a normal distribution?

Dr. Tabber: Based on the histogram, the probability plot, and the Anderson-Darling (AD) test for normality, there’s no way these residuals could be called normal.

Prosecutor: Your honor, ladies and gentlemen of the jury. As we’ve clearly shown, the errors in the defendant’s regression model are not normal. In fact, these model errors can only be called shockingly deviant, flying in the face of all decent standards of normality!

[Pandemonium breaks loose in the courtroom.]

Spectator 1: “Lock him up!”

Spectator 2: “Throw the textbook at him!”

Spectator 3: “Take away his model!”

Spectator 4: “Make him eat his rotten coefficients!!”

Judge: (slamming gavel) Order! Order! This court is adjourned. We’ll take a 2-week recess before addressing the other charges.
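
A quick postscript outside the courtroom: if you'd like to run the same residual checks yourself, here is a minimal Python sketch on simulated data. It is only an illustration of the idea; note that scipy's Anderson-Darling routine reports a test statistic and critical values rather than the p-value that Minitab prints on its probability plot.

```python
# Sketch: fit a simple regression to simulated data with skewed errors (like Exhibit B),
# then check the residuals for normality with the Anderson-Darling test.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 80)
y = 3 + 2 * x + rng.exponential(scale=2, size=80)    # right-skewed errors

residuals = sm.OLS(y, sm.add_constant(x)).fit().resid

result = stats.anderson(residuals, dist="norm")
print("AD statistic:", round(result.statistic, 3))
print("5% critical value:", result.critical_values[2])   # index 2 corresponds to the 5% level
# If the statistic exceeds the 5% critical value, reject normality of the residuals --
# the same conclusion an AD p-value below 0.05 would give in Minitab.
```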
 

Illustration: The Jury, by John Morgan (1861) public domain image.
