After Recording Data With Two Apps Every Single Day During 2021, I Performed an Eye-Opening Statistical Analysis on Mental Health

I used Python to explore several trends in one year of data exported from mobile apps. I discuss the results and how personal data analysis skills can help you.

Jake Haines
8 min read · Jan 1, 2022

Preface

Mental health concerns have increased since the start of the COVID-19 pandemic and show no signs of easing. This poses a huge risk to younger generations, many of whom don’t have resources readily available to manage their mental health successfully.

As a data scientist, I have a feline curiosity for the world around me, and crave coming up with unconventional ways to gather knowledge that can provide better insight into problems.

Photo by Brett Jordan on Unsplash

Disclaimer

This data was submitted voluntarily by a friend.

Do not use medication without prior consultation with a doctor. If you’re considering medication and don’t know what to do, reach out to me. I am happy to help you find a good psychiatrist.

Please reach out if you need someone to talk to; I am more than happy to help.

Motivation

My whole life, I have had a natural curiosity and a drive to research the issues in my life until I understand where they come from. As such, my natural reaction to seeing someone struggle was to map that struggle using data.

I wanted to look at trends between mood, times, and emotions.

My thought was this could hopefully provide more insight into what should matter to the individual and what shouldn’t, in order to optimize where they put their efforts (as one does).

If you’re interested in jumping straight into the notebook, have at it: here’s the GitHub repo.

Data Acquisition

Over the last year, my friend used two apps to track mental health data:

  • Pixels (iOS/Android): daily mood tracking, with the option to log various emotions that occurred during the day
  • Round (iOS): by far the simplest method to track daily medication intake

Apart from a few missed days in Pixels, and emotion logging that stopped after six months, data was recorded every day from January 3, 2021 to December 31, 2021.

Both apps give you an option to easily export data. Pixels exports to JSON format and Round exports to CSV.
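As a rough sketch of the loading step, the snippet below reads one JSON export and one CSV export into pandas. The field names (date, mood, emotions, scheduled, taken, dose) and the sample rows are assumptions for illustration only; the real Pixels and Round exports may be laid out differently.

```python
import io
import json

import pandas as pd

# Hypothetical stand-ins for the exported files. The real Pixels and Round
# schemas may use different field names.
pixels_json = io.StringIO(json.dumps([
    {"date": "2021-01-03", "mood": 3, "emotions": ["tired", "calm"]},
    {"date": "2021-01-04", "mood": 4, "emotions": ["happy"]},
]))
round_csv = io.StringIO(
    "scheduled,taken,dose\n"
    "2021-01-03 08:00,2021-01-03 08:12,Med A 10 mg\n"
    "2021-01-04 08:00,,Med A 10 mg\n"
)

# JSON export: one record per day, with the emotions kept as a Python list.
pixels = pd.DataFrame(json.load(pixels_json))

# CSV export: an empty "taken" field is read as a missing value (NaN).
meds = pd.read_csv(round_csv)

print(pixels)
print(meds)
```

With real exports, the two StringIO objects would be replaced by open file handles or file paths.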

Data Transformation

For the data janitor stuff, I used pandas and some built-in datetime functions.

For Round medication data, I had to standardize the string values containing dosage, since some contained the medication name and unit label. I also had to do quite a bit of tinkering with string values containing date and time, since .astype() did not work for that. Additionally, irrelevant columns were dropped. Transformed Round data looked like this:
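Here is a minimal sketch of that kind of cleanup. The column names, the sample dosage strings, and the regular expression are all assumptions; I use pd.to_datetime here as one common way to parse timestamp strings, not necessarily what the notebook does.

```python
import pandas as pd

# Hypothetical raw rows: dosage strings are inconsistent and one dose was missed.
meds = pd.DataFrame({
    "taken": ["2021-01-03 08:12", "2021-01-04 08:30", None],
    "dose": ["Med A 10 mg", "10mg", "10"],
    "notes": ["", "", ""],
})

# Keep only the trailing number from strings such as "Med A 10 mg" or "10mg".
dose_pattern = r"(\d+(?:\.\d+)?)\s*(?:mg)?\s*$"
meds["dose_mg"] = meds["dose"].str.extract(dose_pattern, expand=False).astype(float)

# Parse timestamp strings; anything unparseable or missing becomes NaT.
meds["taken_at"] = pd.to_datetime(meds["taken"], errors="coerce")

# A midnight-normalized date column makes a later join on date straightforward.
meds["date"] = meds["taken_at"].dt.normalize()

# Drop the columns that are no longer needed.
meds = meds.drop(columns=["dose", "taken", "notes"])
print(meds)
```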

For Pixels mood and emotion data, I had to do similar things. The Round and Pixels tables would eventually be joined, so I needed the date to be standardized. Additionally, irrelevant columns were trashed. The transformed data looked like this:

After merging the two tables with df.merge() and adding several additional time columns:
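The merge and the extra time columns can be sketched roughly as follows. The column names and the season mapping (meteorological seasons, Northern Hemisphere) are assumptions, and the rows are made up.

```python
import pandas as pd

# Hypothetical cleaned tables, one row per day.
mood = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-03", "2021-01-04", "2021-07-10"]),
    "mood": [3, 4, 5],
})
meds = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-03", "2021-07-10"]),
    "taken": [1, 1],
})

# Left join keeps every mood record; days with no medication row count as not taken.
df = mood.merge(meds, on="date", how="left")
df["taken"] = df["taken"].fillna(0).astype(int)

# Extra time columns. Note that ISO weeks can place early-January days
# in the last week of the previous year.
df["week"] = df["date"].dt.isocalendar().week.astype(int)
df["month"] = df["date"].dt.month

# Assumed mapping: meteorological seasons for the Northern Hemisphere.
season_by_month = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}
df["season"] = df["month"].map(season_by_month)
print(df)
```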

Using df.describe(), I already found interesting and simple statistics to look at, highlighted in blue below. Mood is a measurement of mood from 1 to 5, 1 being awful and 5 being amazing. Their average mood over the year was a little over neutral — suggesting they had plenty of good days! Additionally, it looks like they remembered to take medication 96.5% of the time.
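For reference, here is how those two numbers fall out of a table like this. The four rows are made up and the column names are assumptions; the point is only that the mean of a 0/1 "taken" column is the share of days the medication was taken.

```python
import pandas as pd

# Made-up rows: mood on a 1-5 scale, taken as 1 (yes) or 0 (no).
df = pd.DataFrame({"mood": [3, 4, 2, 5], "taken": [1, 1, 0, 1]})

summary = df.describe()

# Average mood over the period.
mean_mood = summary.loc["mean", "mood"]

# Mean of a 0/1 column is the fraction of days on which meds were taken.
adherence = df["taken"].mean()

print(summary)
print(f"mean mood: {mean_mood:.2f}, adherence: {adherence:.1%}")
```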

Each record in the Pixels emotions column is a list of emotions, so I also built an overall frequency table counting how often each emotion appears across those lists. More on that later!
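One way to build such a frequency table, assuming the emotions column holds Python lists, is to explode the lists into one row per emotion and count. The emotion labels below are made up.

```python
import pandas as pd

# Made-up records: each day has a list of logged emotions (possibly empty).
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-03", "2021-01-04", "2021-01-05"]),
    "emotions": [["tired", "happy"], ["tired"], []],
})

# explode() gives one row per emotion; empty lists become NaN,
# which value_counts() ignores by default.
emotion_counts = df["emotions"].explode().value_counts()
print(emotion_counts)
```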

Additionally, I assigned a numeric value to each emotion that could be listed: -1 for a negative emotion and +1 for a positive one. Using these values, I created an emotion score for each day, defined as the average value of the emotions logged that day.
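A small sketch of the scoring idea is below. Which emotions count as positive or negative is my assumption here (the valence dictionary is hypothetical), and days with no logged emotions are left as missing rather than scored as zero.

```python
import math

import pandas as pd

# Hypothetical valence mapping: +1 for positive emotions, -1 for negative ones.
valence = {"happy": 1, "calm": 1, "tired": -1, "anxious": -1}

def emotion_score(emotions):
    """Average valence of the emotions logged in one day (NaN if none)."""
    scores = [valence[e] for e in emotions if e in valence]
    return sum(scores) / len(scores) if scores else float("nan")

df = pd.DataFrame({
    "emotions": [["happy", "tired", "calm"], ["anxious"], []],
})
df["emotion_score"] = df["emotions"].apply(emotion_score)
print(df)
```

A day with two positive emotions and one negative one scores 1/3, which is why the score can take values between -1 and +1.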

There were a few more tables I created. In these tables, I created the following statistics, all per given unit of time (week of year, month of year, season of year):

  • mood variance & mean
  • habit score: measures what percent of that unit of time meds were taken
  • time variance: variance of (unix time / 1E7)
  • emotion score variance & mean
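The per-period statistics in the list above can be sketched with a single groupby. The column names and rows are made up, and I am assuming sample variance (pandas' default, ddof=1); the notebook may compute it differently.

```python
import pandas as pd

# Made-up daily records for two months; one dose in January was missed.
df = pd.DataFrame({
    "month": [1, 1, 1, 2, 2, 2],
    "mood": [2, 3, 4, 4, 4, 5],
    "taken": [1, 0, 1, 1, 1, 1],
    "taken_at": pd.to_datetime([
        "2021-01-03 08:10", None, "2021-01-05 09:00",
        "2021-02-01 08:05", "2021-02-02 08:20", "2021-02-03 08:15",
    ]),
})

# Unix time in seconds, scaled by 1e7 as described in the list above.
epoch = pd.Timestamp("1970-01-01")
df["time_scaled"] = (df["taken_at"] - epoch).dt.total_seconds() / 1e7

monthly = df.groupby("month").agg(
    mood_mean=("mood", "mean"),
    mood_var=("mood", "var"),         # sample variance (ddof=1)
    habit_score=("taken", "mean"),    # share of days meds were taken
    time_var=("time_scaled", "var"),  # missing timestamps are skipped
)
print(monthly)
```

Swapping "month" for a week or season column gives the other tables.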

For example, the seasonal distribution table looks like this:

The weekly and monthly tables follow the same layout.

Data Analysis

For most of the analysis, I used plotly to visualize various scenarios. I also used a t-test in two scenarios. Below, I’ll make inferences based on what’s seen in each scatterplot.

Now, I’ll show the results of two t-tests I performed. The first looks at a difference in mean mood between summer and winter:

The second looks at the difference in mean mood between days when medication was taken and days when it was not:
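For readers who want to try this kind of comparison, here is a minimal sketch using scipy.stats.ttest_ind. The mood values are made up, and I am assuming a standard two-sample t-test with equal variances (scipy's default); the notebook may use a different variant.

```python
from scipy import stats

# Made-up daily mood values (1-5 scale) for two groups of days.
summer = [4, 5, 4, 3, 5, 4, 4]
winter = [3, 2, 4, 3, 3, 2, 3]

# Two-sided test: is there any difference in mean mood?
t_stat, p_two_sided = stats.ttest_ind(summer, winter)

# One-sided test (summer > winter): when the t statistic is positive,
# the one-sided p-value is half the two-sided one.
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2

print(f"t = {t_stat:.3f}")
print(f"two-sided p = {p_two_sided:.4f}, one-sided p = {p_one_sided:.4f}")
```

The same call works for the taken versus not-taken comparison by passing the mood values for each group of days.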

The last thing I looked at was a colorful bar chart of emotions, and their overall frequency throughout the first six months of the year:

Results

First, I’ll go over my thoughts on the scatterplot matrix.

  1. Mood Variance vs Week: This illustrates the trend in mood variability throughout the weeks over the year. The mood became less turbulent over the last year!
  2. Mood Variance vs Mean Mood, weekly: When calculating distribution statistics, the variance and means were calculated on a per-week basis. Here, the downward trend suggests that there is some relationship between mood variance and mean mood in a given week. This could mean that in a week, less variability in mood is associated with a higher mean mood. Not surprising!
  3. Mood Variance vs Habit Score: There is no clear trend here, although the slight decrease on the linear trendline could be indicative of a relationship between how good they were about taking medications and their mood’s variability.
  4. Mean Mood vs Habit Score: This trendline almost looks like a direct reflection of Plot 3 over the x-axis. This suggests some collinearity between mood variance, mean mood, and habit score; in other words, they are all related to each other. It makes practical sense: remembering to take medication consistently usually goes with a happier and more stable mood, and a more stable mood over a week is related to a happier mood in that week.
  5. Mood Variance vs Time Variance: There isn’t much of a trend here, but most records are clearly clustered at the low end of time variance. This shows they were good about taking medications at around the same time each day.
  6. Mood Variance vs Month: Very obvious trend. It is consistent with Plot 1, and it can be said that their mood became more stable over the course of 2021.
  7. Mean Mood vs Month: This is my favorite. There is a clear story here, and I used a lowess curve instead of a linear trendline to show it. Their mood averaged highest in the summertime and fall, and dipped below 3 in the colder and rainier months.
  8. Mood Variance vs Mean Mood, monthly: The per-month calculations equivalent of Plot 2. This confirms the previous conclusion that an increase in mood is associated with a decrease in volatility.
  9. Habit Score vs Month: Overall, it looks like consistency with taking medications went up throughout the year and remained perfect in the last few months of 2021.
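Most of the readings in the list above come down to the direction of a fitted trendline. As a hedged illustration with made-up weekly values (not the real data), numpy.polyfit with degree 1 fits the same kind of straight line, and its slope gives the direction of the trend.

```python
import numpy as np

# Made-up weekly statistics: mean mood and mood variance for four weeks.
mood_mean = np.array([2.5, 3.0, 3.5, 4.0])
mood_var = np.array([1.5, 1.0, 0.9, 0.4])

# Degree-1 polynomial fit returns (slope, intercept) of the least-squares line.
slope, intercept = np.polyfit(mood_mean, mood_var, 1)

# Pearson correlation gives a scale-free measure of the same relationship.
corr = np.corrcoef(mood_mean, mood_var)[0, 1]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {corr:.2f}")
```

A negative slope here corresponds to the downward trend described for Plots 2 and 8: weeks with a higher mean mood have lower mood variance.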

The results of the Summer-Winter t-test suggest a significant difference in average mood between summer months and winter months (two-sided p = 0.0266). Unsurprisingly, the summer mood was the greater of the two (one-sided p = 0.0133).

The results of the Taken-Not Taken t-test (p = 0.6662) suggest there is no notable difference in mood between days when medication was taken and days when it was not. However, most of the time, they only missed one day of meds at a time. If they were to skip their medication for three days in a row, I would likely see a much stronger relationship here, since that’s about the time withdrawal symptoms come on for these medications.

In the bar chart, it was super interesting that while most of the more frequent emotions were positive, the top emotion was tiredness. Tiredness is a common side effect of many psychiatric medications.

Conclusions

So, here are my takeaways:

  • They’re happier with less mood fluctuation. They should aim to be more in touch with their mood and manage it better, so as to keep improving.
  • Mood fluctuation went down overall, which would mean they slowly grew happier overall. That’s good progress they should recognize.
  • Consistency in taking medications seems to be associated with increase in mood and decrease in volatility. They should remain consistent with taking medication.
  • Mood tends to be lower in the cooler and darker seasons. They should consider living somewhere warmer and sunnier in the future, and make sure there is extra effort to generate energy in those seasons.
  • Since the emotions frequency data was only recorded in the first six months of 2021, they should monitor their energy throughout the day and talk to their doctor if they are still experiencing consistent tiredness.

Overall, the data created and analyzed provided very valuable insights. The recommendation in each takeaway above is the actionable insight. An important note for you is that the universe, circumstance, the world, and your struggles will not wait for you. They will not change for you. If you have insight and want something to change, then do something.

Why is analysis of personal data overlooked?

Who ever said data analysis had to be strictly a business or academic pursuit? The insights data analysis creates can be incredibly powerful. Use these tools to your advantage, and consult professionals (and friends/family) when needed. I find peace in knowing what’s going on around me and within me, and you might too.

Personal data science projects are fun (yet sometimes incredibly frustrating), boost your experience, increase your domain knowledge, and may help you improve your own life.

Companies collect a ton of data from you all the time. Find that data, create your own data, and use it — it’s one of the most underrated skills out there right now.

As usual, let me know how you like this article. Feel free to comment and share. If you need anything, and I mean absolutely anything, email me or message me on LinkedIn.

Till next time!

Jake Haines

Data Engineer // Ex-Tesla // Statistics @ NC State Univ.