So, you’ve been assigned to work on your organization’s data analytics in order to show what variables are impacting your giving program, where your organization can improve its work processes, or what potential prospects are falling through the cracks. Now what?
Don’t panic! If you have a basic understanding of Excel and PowerPoint and know how to navigate your database, you’ll be fine. Below is a step-by-step guide on how to conduct basic analytics on your system.
1. Decide what you want to study.
Do you want to know what variables are driving the success of a particular program? Do you want to know whether a certain subset of former athletes is more likely to purchase season tickets? Pick a program that interests you and your leadership and go for it.
2. Get a white board or a legal pad of paper and write down all of the variables that you know are in your system.
You’ll even want to write down variables that are not in your system but might play a role in impacting this program.
3. Determine the main characteristics of your model subjects.
For example, if you are looking to increase the number of season ticket holders for your athletics department, maybe your test group of model subjects should contain everyone who has purchased season tickets in the last three years. This group is the one for which you will run the analysis.
4. Draw out your list of scatterplots.
These scatterplots will show how your variables are related. Don’t worry if you’ve never made one; Excel’s ability to create charts almost verges on magic. All you need to do is determine your independent variable (the variable that is controlled) and your dependent variable (the variable that the independent variable impacts). For example, if you are trying to determine whether age is an indicator of giving, then “giving” is your dependent variable (and will go on the y-axis in your charts and graphs) and “age” is your independent variable (and will go on the x-axis).
5. Meet with your IT/reporting department to explain the variables that you are going to need to pull out of the system.
Keep in mind that many of these variables will be necessary for future projects, so you may want to either 1) learn how to pull them out of the system yourself or 2) have a standard report created that you can use anytime you want to build a new set of models.
6. Have the reports sent to you (or download them yourself) as an Excel file.
You’ll need one file for the full group related to the program you are working on (i.e., all season tickets holders for the athletics department for all time) and one file for the study group (i.e., new season ticket holders for the athletics department within the last three years).
For me, it works best to keep everything in one massive Excel file with tons of tabs. Depending on your preferences, you may find it more useful to create a new Excel document for each chart that you make. The choice is yours.
7. Create a bar graph or column chart for every independent variable that you want to see if/how it affects your dependent.
For example, chart the age ranges of all of your season ticket holders for the athletics department if you are determining what variables affect whether someone becomes a season ticket holder. Charting this information allows you to quickly see what information you have in your database, what information is missing, what type of connection might exist between your independent and dependent variables, and for which variables the data is widely different for your full group and your study group (i.e., all of the members of your study group attend volleyball games, but very few members of the full group attend volleyball games).
You’ll need to create charts for both data sets. You can either present these charts in two separate slides (which I prefer) or in a comparison slide in your PowerPoint presentation.
I am a huge fan of GCFLearnFree.org if I don’t know how to do something in Excel. If you’ve never created a chart in Excel, I highly recommend this step-by-step tutorial. And if you need a refresher on the types of charts available in Excel, I recommend this tutorial.
8. Create a PowerPoint presentation with your charts.
Not only does creating a PowerPoint save you time later (as you’ll certainly be asked to present your findings), but it also allows you to organize and quickly view what you have already done in your project. Place each bar chart on a separate slide with a simple title. You can even create comments on a slide if you want to remember a particular quirk about a piece of data.
9. Once you have built your background information slides, you’re ready to create your analysis.
This is the fun part. The first thing you need to do is separate your variables into two categories: numerical and yes/no. For the numerical variables (i.e., age ranges), you will use scatterplots. For the yes/no variables (i.e., attended college), you will use the r-squared value. Remember, you will only create the scatterplots and r-squared values for the study group. You want to determine what is causing that particular group to undertake the desired action.
10. Create scatterplots for each of your numerical variables and insert them into your PowerPoint.
Here is a quick tutorial on how to create a scatterplot.
11. Create r-squared values for the yes/no variables.
This process is much easier than it sounds. To create an r-squared value, you simply input this formula into your Excel document: =RSQ(A3:A9, B3:B9). Replace the columns with the actual columns containing your information. Here is a quick explanation of what it does.
Basically, this will show whether a non-numerical value impacts your dependent variable. The results will vary from -1 to 1. If your result is close to -1, then the independent variable has a negative impact on your dependent variable. If it is close to 0, then there isn’t a strong (or any) impact. If your result is close to 1, then the impact is positive. In most cases, you’ll want your results to be as close to 1 as possible, but anything above 0 shows a positive impact. Take each of your results and put them in their own PowerPoint slides.
12. Create a recap slide explaining the top influencing variables and some suggestions for how to use the data to improve the particular program that you are studying.
Using the scatterplot and r-squared values data, you should now know what variables are positively impacting the study group. You can then look at your full group to see who has the same variables, as they are likely candidates for the activity you are trying to predict (i.e., if you find that five variables—over 60, married, have kids in college, have a dog, and drive a station wagon—are positively impacting peoples’ decisions to buy season tickets for the athletics department, then you’ll want to find people in your database who are not in the study group who also exhibit these five variables*).
*I absolutely made up these variables.
That’s it! While the task may seem daunting in the beginning—I know that it did for me—I hope this step-by-step process has helped you become more secure in your new job responsibilities and your ability to gather and determine the information requested. If you have any suggestions about other data analytics methods or on how to improve the above process, I’d love to hear them.
Emily Davis began as a research associate at the Indiana University Foundation and was promoted to development analyst in 2011. In addition to assisting her science- and health-related clients with research, she has completed research for the IU and IU Foundation presidents as well as on international prospects. She received her undergraduate degrees from Ball State University and her master’s degree from Indiana University Bloomington. In her free time, she enjoys completing freelance editing and formatting work when not watching Thomas the Train with her toddler.