Data Art Wk 5-7: Text Archive

View live: https://emilylin-itp.github.io/data-art/wk5-7-textarchive/language_dep_final/

Code here: https://github.com/emilylin-itp/data-art/tree/gh-pages/wk5-7-textarchive/language_dep_final

***Not very mobile friendly! Will really aim to get better at responsiveness!

—

Objective:

I am interested in the connection between language and emotional well-being. Is there a way to spot the signs of depression, anxiety, or suicidal tendencies based on the words we use and how we use them? In verbal communication, I think people often don’t say exactly how they feel for the sake of keeping it together. But do people who are depressed subconsciously use words in a way that subtly indicates their mental state? Can we spot who is struggling even if they don’t explicitly say they are? I just want to know if there is a way to read between the lines for depression.

My hypothesis: there is a language to depression. By looking into the work of writers who have killed themselves, I am hoping to test if this theory rings true. Could be wrong, who knows… but curious to see what the text analysis will show.

—

Credits:

Many thanks to Genevieve for the conceptual feedback and technical resources! The brainstorming session and references were so helpful.

—

Steps:
a.) Research and reading:

I’ve been reading different articles about the connection between words and depression. Many findings suggest that pronouns, absolutist words and auxiliary words help indicate depression. There is less of a link between specific words and suicide, but crisistrend.org lists the top 35 words people used when they called or texted about a specific mental health issue. Based on these findings, I chose the key words I wanted to use for filtering the poets’ work.
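
As a rough illustration of what filtering by key words can look like, the sketch below counts how often words from each category show up in a piece of text. The category names, the word lists, the sample sentence, and the function names `tokenize` and `countByCategory` are all placeholders made up for this example. They are not the key words chosen for the site, and the site's code may work differently.

```javascript
// Hypothetical word lists, for illustration only (not the site's actual key words).
const categories = {
  pronouns: ["i", "me", "my", "myself"],
  absolutist: ["always", "never", "nothing", "everything"],
};

// Lowercase the text and pull out runs of letters and straight apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Count how many tokens belong to each category.
function countByCategory(text, cats) {
  const counts = {};
  for (const name of Object.keys(cats)) {
    counts[name] = 0;
  }
  for (const word of tokenize(text)) {
    for (const name of Object.keys(cats)) {
      if (cats[name].includes(word)) {
        counts[name] += 1;
      }
    }
  }
  return counts;
}

// Made-up sample sentence, not a line from any of the poems.
console.log(countByCategory("I never said my name. Nothing is mine.", categories));
```

Note that this tokenizer only understands straight apostrophes, so text copied from the web with curly apostrophes would need to be normalized first.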


b.) Deciding writers + text:

Then I chose my 3 writers. I decided to stick with only poets because the word counts of novelists and poets are just too far apart. I wanted to include Hemingway and Ingrid Chang, but using one of their books skewed the word count too much. I eventually ended up choosing 3 Confessionalist poets (Sexton, Plath, Berryman) who died by suicide. I kept it to American poets because who knows what gets lost in the translation of poems. Very thankful to https://www.poetryfoundation.org and https://www.poets.org for providing a database for us all.

c.) Concept and design:

There was a lot of information I wanted to put on this site that didn’t make it into the coded version. I had hoped to show the correlation between what was happening in these poets’ personal lives and the content of their poems. This required keeping a timeline of both their life and work. The tricky part is that these poems don’t have great time stamps. Unlike novels, poems often get published in a collected-poems volume and don’t have exact years for when they were written. Some poems were even published in a book only after the poet’s death. In the end I decided not to use years as an extra data point (though I really wanted to show the correlation) and just included a biography timeline. Not sure if this is working though.

d.) Coding it!

I followed along with Shiffman’s A2Z tutorials on concordance and sentence histograms to get a better understanding of how to work with text. Here are some of the test codes.
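
For reference, the core idea of a concordance (counting how many times each word appears) fits in a few lines. This is a minimal sketch written to illustrate the idea, not the tutorial's code and not the test code mentioned above. The function name `concordance` and the sample text are assumptions made for this example.

```javascript
// Build a word-frequency list, sorted from most to least frequent.
function concordance(text) {
  // Split on anything that is not a letter, digit, or underscore,
  // then drop the empty strings left at the edges.
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean);

  // A Map avoids collisions with built-in object properties
  // such as "constructor".
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }

  // Turn the Map into [word, count] pairs and sort by count, descending.
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up sample text, not a line from any of the poems.
console.log(concordance("the sea and the sky and the sea"));
```

Splitting on `\W+` is crude: it breaks contractions like "don't" into two tokens, so a real text would probably want a more careful tokenizer.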

I compiled all the poems I wanted to use into txt files. Using the RiTa.js documentation, I was able to find key words in context. “kwic()” splits the sentence into 2 halves at the word. The word is its own variable, the phrase before the word gets pushed into one array, and the phrase after it gets pushed into another.
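
The split described above can be sketched in plain JavaScript. To be clear, this is not RiTa's implementation and does not call RiTa at all. It is a simplified stand-in that shows the same idea. The function name `keywordInContext`, the `windowSize` parameter, and the sample sentence are hypothetical.

```javascript
// Find every occurrence of `keyword` and collect the words around it.
// `windowSize` is how many words to keep on each side (an assumed default).
function keywordInContext(text, keyword, windowSize = 4) {
  const words = text.split(/\s+/).filter(Boolean);
  const target = keyword.toLowerCase();
  const before = []; // phrases that come before each match
  const after = [];  // phrases that come after each match

  words.forEach((word, i) => {
    // Strip punctuation so "never," still matches "never".
    const bare = word.toLowerCase().replace(/[^a-z']/g, "");
    if (bare === target) {
      before.push(words.slice(Math.max(0, i - windowSize), i).join(" "));
      after.push(words.slice(i + 1, i + 1 + windowSize).join(" "));
    }
  });

  return { keyword, before, after };
}

// Made-up sample sentence, not a line from any of the poems.
const sample = "I was tired and I was never home. She never called.";
console.log(keywordInContext(sample, "never", 2));
```

Keeping `before` and `after` as parallel arrays means the entries at the same index belong to the same match, which makes it easy to lay the key word out in a center column.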

—

Questions (No Answers Yet):

When using RiTa.js’s kwic, it would give me a duplicate of the array. Also, “undefined” shows up in weird places even when there are no special characters in the text.
- I managed to hack it by putting the duplicate data into an empty string so it doesn’t print out twice, but I would like to know what is really going on here. Why is it printing a duplicate version of the array?
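
This doesn't answer why the duplicates appear, but a more general way to hide the symptom is to filter the results before printing them. The sketch below is plain JavaScript and makes no assumptions about what kwic returns beyond it being an array of strings. The function name `uniqueLines` and the sample array are made up for this example.

```javascript
// Return the lines in their original order, skipping undefined/null
// entries and any line that has already been seen.
function uniqueLines(lines) {
  const seen = new Set();
  const result = [];
  for (const line of lines) {
    if (line === undefined || line === null) continue; // drop stray "undefined"
    if (seen.has(line)) continue;                      // drop exact repeats
    seen.add(line);
    result.push(line);
  }
  return result;
}

// Made-up sample data standing in for a duplicated result array.
console.log(uniqueLines(["first line", "second line", "first line", undefined]));
```

One caveat: this also removes lines that are legitimately repeated in a poem, such as a refrain, so it is a workaround and not a fix for the underlying cause.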

Resources regarding the connection between language + depression:

Helpful resources for text analysis:

More on associative arrays: https://www.i-programmer.info/programming/javascript/1441-javascript-data-structures-the-associative-array.html
A perfect example of what I wanted to do, though I didn’t end up using their example code (not as helpful as I thought it’d be): https://rednoise.org/rita/examples/p5js/KWICmodel/
Super resources from Allison Parrish! This was so helpful for converting my txt file to json! I didn’t end up using this but it was nice to know there’s this option: http://static.decontextualize.com/lines-to-json/
- another blessing from her: https://creative-coding.decontextualize.com/intro-to-ritajs/
https://github.com/veev/DataArtFall2019/blob/master/section-2/03-rita-pos-matching/sketch.js
Another amazing set of tutorials from Coding Train:
- https://shiffman.net/a2z/text-analysis/
- https://www.youtube.com/watch?v=lIPEvh8HbGQ
- I can’t even begin to describe how perfectly he summed up how I’ve been thinking/feeling about pronouns /use of words + depression/emotion. Will have to read that book he mentioned – “The Secret Life of Pronouns.”

Resources for coding (in general):

grids! need to make my own! like bootstrap but should really learn how to code this by hand: https://cheesecakelabs.com/blog/dear-designers-love-developers-learned-display-grid/
