It’s Data Season! A step-by-step approach to using data for instruction

I’ve said it before and I’ll say it again: school leaders and teachers are inundated with too much data and often do not receive enough training in how to deal with it.  Data is great and can be very useful, but only if you know how to use it. Otherwise, it’s a confusing mess of contradictory messages that can sometimes lead to bad decisions.  If you don’t know the best ways to look at your data, then it can cause more problems than it is worth. With that in mind, I’ve created a list of steps you can use when looking at your data that I hope will be helpful in making all that data useful and actionable!  I wrote a similar three-part series on this issue here, here, and here that covers a bit more of the basics of data.  It’s a good primer for this post.


Step 1: Form a Hypothesis:

Based on your high-level data, your classroom observations, and any other qualitative or quantitative data you have, you should have a pretty good idea of what you think the problems are in instruction in your school or classroom.  Use these ideas to form a set of hypotheses that you can use data to test.

A good hypothesis has the following characteristics: verifiable, specific, describes a relationship.

For example, the hypothesis “female students are better at math because they are tired of not having STEM role models” is not a good hypothesis because it is not verifiable with the data available to you and it is not specific.  A better hypothesis would be “female students are outperforming male students in math because they feel they have to counter the stereotype about women in STEM as measured by the question: ‘do you feel motivated to counter examples about your gender’s performance in STEM?’ on the student survey.”  

Forming a hypothesis or series of hypotheses before you dig deeply into the data is hugely important. Data should be used to test what you are seeing in the classroom. It will not always confirm your hypotheses; when it doesn’t, think again and come up with another hypothesis that may explain what you are seeing.  If you don’t have a good idea of what you are looking for in the data, it can lead you to false conclusions.


Step 2: Gather as much data as possible:

Data should be tailored to each hypothesis. That is, just because you have a piece of data doesn’t mean it is useful for addressing the issue at hand.  If you have the right data, you should be able to answer “yes” to each of these questions: Is the data relevant? Is it at the necessary granularity? Is there more than one source of data?

Is the data relevant? This first question is pretty obvious. If your hypothesis is about the relationship between value for education and achievement, you don’t need students’ zip code or address.  Ensure that the data you are using to answer your question is relevant to the question you are asking.

Is it at the necessary granularity? This second question is a little tougher because educational data often has limitations.  A lot of times we are able to get school or grade-level data, but not classroom or student-level data.  The problem is that the questions we have are often about specific classroom practices or student groups.  In those cases, school-level data isn’t useful. You need data at the granularity required to answer your question. For example, our example hypothesis above cannot be answered with classroom-level data.  You need to be able to compare genders and survey results, so student-level data is required. It’s possible you’d find that the relevant data isn’t at the correct granularity and then have to collect additional data to answer your question.
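As a rough sketch of why student-level data matters for the example hypothesis, the Python below joins three made-up student-level sources (math scores, gender, and a survey answer) by student ID. Every name and number here is hypothetical and only for illustration; the point is that this comparison can’t be made if all you have is a classroom average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student-level records keyed by student ID (made-up values).
math_scores = {"s1": 88, "s2": 91, "s3": 72, "s4": 75, "s5": 80}
gender = {"s1": "F", "s2": "F", "s3": "M", "s4": "M", "s5": "F"}
# Answer to the survey question about motivation; s5 did not take the survey.
survey = {"s1": "yes", "s2": "yes", "s3": "no", "s4": "no"}


def join_students(scores, genders, answers):
    """Keep only students who appear in all three sources."""
    rows = []
    for student_id, score in scores.items():
        if student_id in genders and student_id in answers:
            rows.append((genders[student_id], answers[student_id], score))
    return rows


def mean_by_group(rows):
    """Average score for each (gender, survey answer) pair."""
    groups = defaultdict(list)
    for student_gender, answer, score in rows:
        groups[(student_gender, answer)].append(score)
    return {key: mean(values) for key, values in groups.items()}


rows = join_students(math_scores, gender, survey)
group_means = mean_by_group(rows)
for key, value in sorted(group_means.items()):
    print(key, value)
```

Notice that the student without a survey response drops out of the comparison. That is worth keeping an eye on when you combine sources, because the students missing from one source may not look like everyone else.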

Is there more than one source of data? This third question is more of a check than a technical question.  In all cases, we want to triangulate as much data as possible (state assessments, benchmarks, walkthrough rubrics, surveys, interviews, etc.). We want the quantitative data to back up what the qualitative data is saying and we want more anecdotal information to inform how we look at those data.  The more sources that agree with your hypothesis, the stronger your findings are. If you test a hypothesis with only one data source, what you found may be a result of a bias within the data source rather than an accurate picture of what is happening.

Finally, you need to ask yourself if you need more data.  If your answer to questions 1 or 2 is “no”, you need more or better data.  If your answer to question 3 is “no”, you probably need more data. Sometimes this requires significant resources and may mean that the question isn’t important enough to seek out the answer to.  Sometimes it just means you need to look for additional sources of data that you already have. This is where you have to weigh the importance of the time and resources required with the potential impact of the analysis.


Step 3: Look for evidence:

There’s an almost infinite number of techniques you can use to analyze data, which can be intimidating; however, you really don’t need to know sophisticated statistics to glean a lot of information from your data.  If you can take an average, plot out data points, and subtract, you can do a whole lot!

A data analyst’s most powerful tool is the mean, which Excel calculates with the AVERAGE function!  Almost all actionable data relies on group comparisons of averages. This is great, because it means anyone can do it!  However, there are some things to check before you accept the average of numbers as a reasonable description of a group. In addition to taking the average of scores, you need to plot out the points in a scatterplot.  This will give you a visual way to look at the dispersion of the data. You’ll want to look for groups of outliers that may pull the average up or down and any groupings in the data points. Sometimes you’ll see two or more distinct groupings, which means there is an underlying factor impacting the group’s performance.  This requires further thinking and analysis to uncover.
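Here is a minimal sketch of that check in Python, using only the standard library and two made-up classrooms. The scores and the `summarize` helper are hypothetical. Both classes have the same average, but the median and the spread show that one of them is being pulled down by a single very low score.

```python
from statistics import mean, median, pstdev

# Hypothetical benchmark scores for two classrooms (made-up numbers).
class_a = [70, 72, 74, 76, 78]
class_b = [84, 86, 88, 90, 22]


def summarize(scores):
    """Return the average plus a few quick checks on how spread out the scores are."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "spread": round(pstdev(scores), 1),  # population standard deviation
        "lowest": min(scores),
        "highest": max(scores),
    }


summary_a = summarize(class_a)
summary_b = summarize(class_b)
print("Class A:", summary_a)
print("Class B:", summary_b)
```

When the mean and median sit far apart, as they do for the second class, that is a signal to plot the points before trusting the average as a description of the group.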

In the image below, you can see that a scatterplot can tell you a lot about how strong the correlation is between the things you are looking at.  All of these datasets have the same mean, but the strength of the relationship is very different. Looking at the data can help you see that without doing any statistical analyses.  
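The same idea can be checked with a few made-up numbers. In this sketch, two hypothetical sets of scores have exactly the same mean, but one lines up perfectly with the other variable and the other barely does. The `pearson` helper is a hand-written version of the standard Pearson correlation formula, and the variable names are only for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: one shared x variable, two sets of outcomes.
x_values = [1, 2, 3, 4, 5]
scores_tight = [2, 4, 6, 8, 10]   # rises steadily with x
scores_loose = [6, 2, 10, 4, 8]   # same numbers, shuffled

print("means:", mean(scores_tight), mean(scores_loose))
print("tight r:", round(pearson(x_values, scores_tight), 2))
print("loose r:", round(pearson(x_values, scores_loose), 2))
```

The averages alone would tell you these two datasets are identical. The correlation, or simply a glance at the plotted points, tells you they are not.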

Similarly, an outlier can bring down the mean of a group and make the correlation look weaker than it is, but with a scatterplot, it can be easy to spot the relationship. Take the examples below: the relationships are actually pretty strong, but the outliers make them look less so. When you graph the data, you can spot the outliers and the true relationship becomes clearer.

Something to note is that we don’t mean that you should ignore outliers. In this case, outliers are kids, and just saying “this kid is weird” isn’t OK.  Instead, we note them and conduct additional observations to determine why that student’s data is different from everyone else’s.  It’s another tool to help direct differentiation.
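If you want to see how much a single point is affecting a relationship, one simple approach (my suggestion here, not the only way to do it) is to recalculate the correlation with each student left out in turn. The sketch below uses made-up data and a hand-written `pearson` helper. It flags the student for follow-up rather than deleting them from the data.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def leave_one_out(xs, ys):
    """Correlation recomputed with each point removed, as (index, r) pairs."""
    results = []
    for i in range(len(xs)):
        xs_rest = xs[:i] + xs[i + 1:]
        ys_rest = ys[:i] + ys[i + 1:]
        results.append((i, pearson(xs_rest, ys_rest)))
    return results


# Hypothetical data: the last student breaks an otherwise steady pattern.
x_values = [1, 2, 3, 4, 5, 6]
scores = [2, 4, 6, 8, 10, 0]

r_all = pearson(x_values, scores)
flagged_index, r_without = max(leave_one_out(x_values, scores), key=lambda pair: pair[1])

print("correlation with everyone:", round(r_all, 2))
print("student to follow up with (index):", flagged_index)
print("correlation without that student:", round(r_without, 2))
```

The flagged index is a prompt for a conversation or an observation, not a reason to throw the student’s data away.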

And finally, sometimes data can cluster around a few points.  That means that the relationship isn’t as simple as a straight correlation.  Take the example data below. You can see that the dots are clustering around 5 distinct points.  This could mean a lot of things depending on your data, so it’s important to take a step back and think critically about whether these are distinct groups or whether another variable is impacting the data.  For instance, if this were performance data, we might see student socio-economic status interacting with the data and causing clustering. This pattern can make things a little trickier, but it’s important to see.
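A plot is the easiest way to see clustering, but you can also get a rough read from the numbers. The sketch below is one simple, hypothetical approach: sort the scores and start a new group wherever there is a big jump between neighbors. The scores and the `min_gap` threshold are made up, and choosing the threshold is a judgment call that depends on your data.

```python
def split_into_clusters(values, min_gap):
    """Sort values and start a new cluster wherever the jump exceeds min_gap."""
    if not values:
        return []
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > min_gap:
            clusters.append([value])      # big jump: start a new cluster
        else:
            clusters[-1].append(value)    # small jump: same cluster
    return clusters


# Hypothetical scores that bunch up around a handful of values.
scores = [31, 90, 10, 52, 71, 11, 50, 30, 91, 70]
clusters = split_into_clusters(scores, min_gap=10)

print("number of clusters:", len(clusters))
for cluster in clusters:
    print(cluster)
```

Finding the groups is only the first step. The next question is what the students in each group have in common, which takes you back to your other data sources.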

Group comparisons are another great tool.  They allow you to see if there are differences between groups in performance or attitude and can tell you a lot about where you should focus your attention.  They seem pretty easy, but there are a few things to keep in mind. First, as mentioned above, you want to plot the data out to make sure the average is a reasonable summary of the group performance.  Second, you’re going to want to make sure you’re looking at all the relevant comparisons. For instance, if you are hypothesizing that English learners underperform on ELA assessments, you need to look not only at the relationship between English learners and ELA performance, but also between non-English learners and ELA performance.  If English learners and non-English learners have similar patterns, that’s evidence your hypothesis is incorrect. Additionally, looking at other student groups who may have similar patterns can be interesting. In this same example with English learners, look at the comparison between students qualifying for free and reduced lunch and ELA performance. You may find that what you were seeing with English learners is actually related to poverty and not language.  Looking at all the evidence and being open to the answer it gives you is important!
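To make the English learner example concrete, here is a small sketch with made-up student records. The field names (`el`, `frl`, `ela`) and the scores are hypothetical. It compares averages by English learner status, then by free and reduced lunch status, and then by both together.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student-level records (made-up values).
# el = English learner, frl = qualifies for free and reduced lunch, ela = ELA score
students = [
    {"el": True, "frl": True, "ela": 60},
    {"el": True, "frl": True, "ela": 62},
    {"el": True, "frl": True, "ela": 58},
    {"el": True, "frl": False, "ela": 80},
    {"el": False, "frl": True, "ela": 61},
    {"el": False, "frl": False, "ela": 79},
    {"el": False, "frl": False, "ela": 81},
    {"el": False, "frl": False, "ela": 80},
]


def group_means(records, keys, value_field):
    """Average of value_field for every combination of the given keys."""
    groups = defaultdict(list)
    for record in records:
        group = tuple(record[key] for key in keys)
        groups[group].append(record[value_field])
    return {group: mean(values) for group, values in groups.items()}


by_el = group_means(students, ["el"], "ela")
by_frl = group_means(students, ["frl"], "ela")
by_both = group_means(students, ["el", "frl"], "ela")

print("By English learner status:", by_el)
print("By free and reduced lunch status:", by_frl)
print("By both:", by_both)
```

In this invented dataset, English learners average about ten points lower overall, but once you compare students with the same lunch status the gap nearly disappears. Real data will rarely be this tidy, but the same three comparisons are worth running before you settle on an explanation.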


Step 4: Identify next steps or further analysis:

Often, the results of one analysis can lead to more questions.  After a careful look at the data, you may find evidence that your hypothesis is not supported. This can lead to a need to dig deeper into the data and ask additional questions.

These questions may lead to additional hypotheses and analysis.  The repeated process of asking questions and conducting analyses is often used to get to a root cause or set of root causes for a particular issue.  Doing so can help you turn that data into effective action because the better we understand the issues, the better we can address them!

Analyzing data can be a challenging and time-consuming task, but it’s worthwhile, and with a few tools and tricks under your belt, you can do it easily!  If you’d like additional support with using data to drive instruction, reach out to Ensemble Learning or directly to me, Dr. Leigh Mingle. Ensemble has several support options to help ensure your school finds success for all students!
