Amanda Levy's Project 1 Analysis

  1. Looking at overall participation color-coded by sex

    1. How has overall gender participation changed over the years between NOCs (just looking at the 15 NOCs mentioned below)?

    2. How has overall gender participation changed over the years between sports?

  2. Looking at impact of winter versus summer Olympics (season)

    1. How has overall gender participation changed over the years between winter and summer Olympics?

    2. How has overall gender participation changed over the years between winter and summer Olympics across NOCs?

  3. Looking at participation filtered to female athletes

    1. How has female participation changed over the years between NOCs?

    2. How has female participation changed over the years between sports?

  4. Looking at each of the top 15 NOCs in terms of female entries

    1. How has female participation changed over the years for this NOC by each sport?

      1. USA

      2. FRA

      3. GBR

      4. ITA

      5. GER

      6. CAN

      7. JPN

      8. SWE

      9. AUS

      10. HUN

      11. POL

      12. SUI

      13. NED

      14. URS

      15. FIN
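
The "top 15 NOCs in terms of female entries" ranking can be reproduced outside Virtualitics. The following is a minimal pandas sketch, assuming the dataset has columns named NOC and Sex with "F" marking female entries; those column names, the function name, and the sample rows are illustrative assumptions, not taken from the project files.

```python
import pandas as pd

def top_female_nocs(df, n=15):
    """Return the n NOCs with the most female entries, largest first."""
    # Assumed column names: "Sex" (with "F" for female) and "NOC"
    female = df[df["Sex"] == "F"]
    # value_counts sorts by count in descending order by default
    return female["NOC"].value_counts().head(n).index.tolist()

# Tiny made-up sample for illustration only (not the real data)
sample = pd.DataFrame({
    "NOC": ["USA", "USA", "USA", "FRA", "FRA", "GBR", "USA", "GBR"],
    "Sex": ["F", "F", "F", "F", "F", "F", "M", "M"],
})
top2 = top_female_nocs(sample, n=2)
```

On the full dataset, calling the function with the default n=15 would give the list of NOCs to assign for the fourth category.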

Class Activity:

  • Look at all six of the graphs in the first three categories

  • Look at assigned graph(s) in the fourth category

  • In the wiki, write one takeaway from each graph

Collaboration Aspect:

  • Everyone gathers takeaways on the same first six graphs

  • Each person/group gains insight and expertise on the NOC they examine for the fourth category, and it is the combination of all these takeaways that lets the group understand the two graphs in the third category more deeply

Additional to-do for me: document my Virtualitics experience and compare it to existing documentation on Virtualitics.

  • There is not good documentation online, especially with regard to what data type is needed in the dataset for each feature and each plot type to work. For example, having data on areas of the world does not mean that the 3D map will work. From the videos I saw online and my failed trial-and-error attempts, it seems that longitude and latitude are needed for that.

  • Additionally, it can be challenging to manipulate the dataset once it is uploaded into Virtualitics, which makes it hard to satisfy the different data type requirements of the plot types. For example, a histogram can take a categorical variable and show the frequency by the height of the bars, but getting a line plot to show the count of a filtered categorical variable proved extremely difficult and much more time-consuming. It required a more advanced understanding of how to make additional columns of data in Virtualitics, how to change variables to numerical representations, and how to apply sums to those and input the result as a data point. Virtualitics does not have detailed documentation on how to do this beyond identifying data manipulation as a feature.

  • Virtualitics' smart mapping feature was extremely helpful for a quick initial orientation toward the best visualization approach for the data, and it is much faster to customize from an already completed format. It's like looking at an example and working backwards, which is helpful in a trial-and-error system where many of the features are not compatible with the dataset. You get to start with what works in Virtualitics and swap out what does not fit your goals, and you ultimately end up changing fewer features than if you started from scratch. It also helps brainstorm alternative perspectives on the data.
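
One possible way around the line-plot difficulty noted in the second bullet is to compute the counts before uploading, so the CSV already contains a numeric column. This is a minimal pandas sketch rather than anything from the Virtualitics documentation; the column names Year, Sex, and Sport, the function name, and the output file name are assumptions for illustration.

```python
import pandas as pd

def count_entries_by_year(df, sex="F", group_col="Sport"):
    """Count entries per (Year, group_col) for one sex as a tidy numeric table."""
    # Assumed column names: "Year", "Sex", and the grouping column
    filtered = df[df["Sex"] == sex]
    counts = (
        filtered.groupby(["Year", group_col])
        .size()                        # number of rows in each group
        .reset_index(name="Entries")   # turn the counts into a numeric column
    )
    return counts

# Tiny made-up sample for illustration only (not the real data)
sample = pd.DataFrame({
    "Year": [1996, 1996, 1996, 2000, 2000],
    "Sex": ["F", "F", "M", "F", "F"],
    "Sport": ["Swimming", "Swimming", "Swimming", "Swimming", "Gymnastics"],
})
counts = count_entries_by_year(sample)
# Hypothetical output file name; the new CSV would then be uploaded
# counts.to_csv("female_entries_by_year.csv", index=False)
```

The resulting table has one row per year and sport with a numeric Entries column, which is the kind of value a line plot needs on its vertical axis.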

Notes from comments in the in-class activity:

  • Looking at Q1

    • It is a lot of data to represent succinctly in a graph. Participants noted that the clearest representation was the VR immersive bar graph. They noted that a line graph would be even messier, since tracking the progression through the dates would be hard with so many lines crossing each other. They found that the color breakdown, with men and women represented in one bar to show the total as well as the frequency and percentage breakdown, was the most comprehensive and efficient approach. They also mentioned that this view was digestible because participants could zoom in and rotate the graph. The participant using the VR immersive view did not find this graph overwhelming and eagerly jumped into the analysis by zooming and rotating the graph, whereas others who viewed the graph in the desktop application found the number of data points very overwhelming and had to take a moment to process before analyzing any of the data. Additionally, understanding in the desktop application view is limited because spinning and zooming in on the graph is much slower and less well calibrated there than in the VR view.

  • Gymnastics and swimming are repeatedly the sports that grab people's attention

    • It makes sense that this is a repeated comment because the graphs go deeper and deeper into the same data to look at smaller subsets, so the overall takeaways should corroborate and build on each other.

  • For Q3

    • there are a lot more women who do Summer Olympics than Winter Olympics

    • started to grow significantly around 1986

  • People initially look at the axis instead of the legend to identify the bars. This can often result in misidentification: Athletics was confused for Arts Competition. Additionally, people commented that it was hard to read the axis with so many categories, so having the bars color-coded (which was the case when filtered to only look at female participation) was very helpful. Another helpful approach to the difficult-to-read axis is to have one person explain what the colors and axes are while a partner scales and rotates the graph in 3D. This is great because it makes the Virtualitics experience (even when limited by the number of licenses) even more collaborative.

  • If there's a setup with a study administrator and a participant, then the most efficient approach is to have the participant keep the headset on throughout the entire analysis session. The study administrator will switch between projects as they follow the folder sequence, entering and exiting VR mode. This will trigger the message "Please Take Off the VR Headset". The participant should ignore this message and keep the headset on.

  • Athletics is a vague category, so I had to look at the original data to analyze what this category represents. Looking at the data, I checked what events were detailed under Athletics and recognized that they were the Track & Field events.

    • One disadvantage of Virtualitics is that there is no way to edit the data once it is uploaded as a CSV. Data manipulation, such as eliminating entries for less frequently held sports or renaming Athletics as Track & Field, must be completed outside of the application and reloaded as a new CSV.

    • Because you cannot change the axis, it is hard to keep people's attention on the sports you are trying to answer questions about. People get distracted by the presence of sports such as Arts Competition. Thus, the setup where the sports or NOCs are filtered to show only the ones that we want to analyze (the top 15) is helpful. This is another reason that filtering to female participation, which allows us to color-code the bars by sport or NOC instead of by the male/female breakdown, is very helpful.

  • Looking at Q4

    • CAN

      • It is hard to know that this project is analyzing Canada. I have not figured out how to add a title to a graph in Virtualitics. Thus, I think the best approach is to make the file structure as simple, clear, and explicit as possible.

    • FIN

      • The 3D VR immersive experience of Virtualitics helps emphasize the height of each bar. This made it faster and easier to identify that there are a lot fewer female participants from FIN than from the previously viewed country (which was CAN in this case). It is helpful that the bar heights scale in proportion to the total number of participants. The raw number/height is more effective for comparisons.

      • When the heights are smaller and differences might be smaller, it is helpful that in the VR view, the participant can zoom in and out and twist around.

      • The most obvious takeaway was that the number of gymnastics participants was dramatically lower than in the overall graphs. FIN's contribution to the significant presence of women in gymnastics seemed lower than average, which suggests that the comparatively high overall number of female gymnastics participants must be the result of relatively very high female gymnastics participation in another country.

        • An interesting comment is that China is not among the top 15 countries by female participation, so it was not one of the NOCs that were analyzed in more depth or color-coded. Thus, the impact of its female gymnasts was not carefully analyzed in terms of how it could compensate for the lower impact from FIN.

        • For another country that counteracts the lower number of gymnasts from FIN, see FRA below.

    • FRA

      • See above note about the number of female participants in gymnastics. FRA had a comparatively high (against other sports for FRA and other countries) and noteworthy number of female participants for gymnastics.
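
Returning to the earlier note that data edits (renaming Athletics as Track & Field, dropping sports such as Arts Competition) must happen outside Virtualitics and be reloaded as a new CSV: the sketch below shows one way that preprocessing step might look in pandas. The Sport column name, the function name, the sample rows, and the output file name are assumptions for illustration.

```python
import pandas as pd

def clean_for_upload(df, keep_sports=None):
    """Rename Athletics to Track & Field and optionally keep only selected sports."""
    cleaned = df.copy()
    # Assumed column name: "Sport"
    cleaned["Sport"] = cleaned["Sport"].replace({"Athletics": "Track & Field"})
    if keep_sports is not None:
        # Keep only the sports we want people to focus on
        cleaned = cleaned[cleaned["Sport"].isin(keep_sports)]
    return cleaned.reset_index(drop=True)

# Tiny made-up sample for illustration only (not the real data)
sample = pd.DataFrame({
    "Sport": ["Athletics", "Arts Competition", "Swimming"],
    "Sex": ["F", "M", "F"],
})
cleaned = clean_for_upload(sample, keep_sports=["Track & Field", "Swimming"])
# Hypothetical output file name; this new CSV is what gets loaded into Virtualitics
# cleaned.to_csv("olympics_cleaned.csv", index=False)
```

Because the filter is applied after the rename, the keep list uses the new name, Track & Field, rather than Athletics.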