Consolidate data collaboratively
Given a large amount of data collected from interviews, observation, or other research, use collaborative data consolidation to build affinity diagrams that distill and find meaning in your data. Use these affinity diagrams as a foundation for decisions and design ideas.
Throughout the design of a project you'll collect huge amounts of information. You might collect it in the form of interview notes, notes of observations from a customer site visit, or observations from a usability test. There may be a lot of these notes. For a modest number of notes, your head can quickly distill them to identify common themes and important ideas. It becomes substantially more difficult if there are a large number of notes, or if those notes cross a variety of interviews or instances of observation.
Many different people on your team may be collecting notes. And, while you may be able to distill your notes in your head, and your teammate in their head, arriving at a single common picture through discussion and sharing data alone can be difficult.
Identifying common themes and important ideas in large amounts of data gathered over time or by multiple data gatherers is extremely difficult.
One approach to solving the problem might be to collaboratively review notes and, as a group, decide on common themes and important ideas to capture in another document that distills these notes from a variety of sources. However, as time passes, it's difficult to look back at the distilled document and recall how the information was arrived at. Tracing the distilled information back to the specific details from interviews or observations may be impossible. In environments where traceability is important, this can be a big problem.
As time passes, additional data may be gathered. New data may cause shifts in importance, or changes to the major themes that emerged from the original data. Considering the impact of new data on an already distilled document is difficult.
Use a collaborative data consolidation session to quickly consolidate raw data using an affinity diagram. Knit new data as it's collected into the same affinity diagram and allow that new data to affect how existing data is summarized and characterized.

Data across a number of interviews can be collaboratively consolidated into a single hierarchical affinity diagram.
A simple affinity diagram clusters like items together. Those clusters might be labeled with a note that characterizes the items in the cluster. While you might use the concept of an affinity diagram in a number of contexts, it's especially useful in distilling large amounts of gathered data.
Building this affinity diagram collaboratively, involving those who collected the data and those who can most benefit from understanding it, allows much of the data, and the important themes in the data, to become part of the collaborators' tacit understanding.
The collaborative data consolidation approach described here is based on the affinity diagramming process described in Holtzblatt and Beyer's Contextual Design. The approach has been generalized a bit since it works well for consolidating data of a variety of types, including data about customer and user needs.
Given a large amount of data to consolidate, plan and execute a collaborative data consolidation worksession to distill this data. This worksession follows the general approach of a collaborative work session where the resulting model is the affinity diagram. Follow the collaborative modeling session approach using these specific details:
Preparation
Set Goals
To set goals for a collaborative data consolidation session consider the nature and volume of the data to be consolidated. What will this data be eventually used for? The goal of this sort of session should be fairly clear cut: to distill the data across some number of sources for some particular eventual, likely design, use.
Write a short statement to describe that goal. It'll come in handy when scheduling meetings.
Identify participants
Ideally include all team members who acquired the original data. They will be our information suppliers.
They along with all other participants will be information modelers and information acquirers.
Choose a team member to function as a facilitator. It's OK to share the job with other team members. Try to avoid asking someone to facilitate at the same time they're supplying information or modeling.
A good collaborative data consolidation session would ideally contain 4-6 participants. Two might be a minimum, and more might be a good idea if there's a very large amount of data to consolidate.
Prepare Data
The data will need to be prepared for consolidation into an affinity diagram. Each distinct piece of data needs to be on a post-it note, or a regular sized slip of paper.
For each source of data, you'll need to identify the pieces of data you wish to consolidate. Not all the information you collected is worth consolidating. Take a highlighter pen to the data and highlight the data you wish to include. This process works best done as a paired activity. Contextual Design formalizes this process in a collaborative worksession referred to as an interview interpretation session.
When identifying data to include, you'll need to separate each distinct piece of data into a single data note. Each idea could be a single data note. This may cause you to split some of the information you've written down into multiple data notes.
To create data notes, it works well to key notes into a spreadsheet, then merge them into a word processor or other tool that will print labels. You can print notes onto regular paper, or pre-perforated card stock. When using regular paper, make sure you have repositionable tape to allow the note to behave like a sticky note. Optionally you could use tape or glue sticks to glue notes directly to stickies. And, while it could take a bit of time, you could hand write the notes directly on stickies.
When placing the data on stickies, make sure you use the same color sticky or note card for all your data. If you're looking for a suggestion, use yellow.
It's very important that when you get your data onto data notes, each note is identified with the source of the data and a distinct number. For instance, if you were consolidating data across several user interviews you might number those user interviews with a user number - U1, for instance, could be user one. If the interview with user one produced 30 pieces of data you wished to consolidate, number each item 1-30. This may leave you with data notes with numbers like: U1-23. Preparing your original data in a spreadsheet makes numbering easy. Place each piece of data in a spreadsheet row, number each row, then merge the spreadsheet onto note cards, or transcribe notes from the printed numbered spreadsheet data.
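If you'd rather script the numbering than do it by hand in a spreadsheet, the scheme above can be sketched in a few lines of Python. This is only an illustration: the `number_notes` function and the sample note text are hypothetical, and it assumes the highlighted data for one source is already in a list.

```python
def number_notes(source_id, notes):
    """Return (note_id, text) pairs such as ("U1-1", "...") for one data source."""
    # Numbering starts at 1 so IDs match the row numbers people see on paper.
    return [(f"{source_id}-{i}", text) for i, text in enumerate(notes, start=1)]


# Hypothetical sample data: highlighted observations from user interview U1.
u1_notes = [
    "Prints the report and highlights issues that stand out.",
    "Reviews the report to find items that need attention.",
]

for note_id, text in number_notes("U1", u1_notes):
    print(f"{note_id}: {text}")
```

Each printed line can then be merged onto a label or note card, keeping the source and number on every data note.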
Prepare Supplies
To build a large affinity diagram you'll need poster paper to fix the data notes to. Use a long roll of poster paper, or butcher paper. White works best, but I sometimes use brown if my data notes are printed on white paper so I can better see the data notes.
You'll need sharpie style markers.
You'll need additional post-it notes in at least three additional colors. Pink, blue, and green are good choices.
Single sided removable tape works well in case you need to fix data notes down to the affinity diagram so it can be moved. Since the tape is removable, you'll be able to pull it off and continue work at a later time.
You'll likely find uses for the general collaboration supplies in your collaborator's toolkit.
Schedule the Worksession
Schedule the worksession in a room with walls that will hold the very large affinity diagram you're about to build. Select a room that you'll be able to keep for the duration of the affinity modeling process since moving the diagram to another location can be time consuming.
Creating an affinity diagram may take a fair bit of time. In Rapid Contextual Design, Holtzblatt, Wendell, and Wood say a single person can place 50-80 notes per day. Given that rate, eight people might place 400 notes in a single day's session.
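To get a rough feel for how long to book the room, the arithmetic above can be sketched as a small Python helper. This is a back-of-the-envelope estimate, not a rule from the book: the `estimate_session_days` function is hypothetical, and it assumes the low end of the 50-80 notes per person per day range by default.

```python
import math


def estimate_session_days(total_notes, participants, notes_per_person_per_day=50):
    """Estimate the whole days needed for a group to place all data notes."""
    if total_notes < 0 or participants < 1 or notes_per_person_per_day < 1:
        raise ValueError("counts must be positive")
    # Assumption: every participant places notes at the same steady rate.
    notes_per_day = participants * notes_per_person_per_day
    return math.ceil(total_notes / notes_per_day)


print(estimate_session_days(400, 8))  # eight people, 400 notes
print(estimate_session_days(400, 4))  # half the people, same notes
```

Treat the result as a starting point for scheduling; distilling and categorizing the clusters takes additional time beyond placing notes.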
Schedule the process to allow for breaks every 90 minutes. Like all collaborative modeling sessions, this is brisk, tiring work. If it's not, you're not doing it right.
Performing
Kickoff
During the kickoff, repeat the goal statement you wrote during preparation. Introduce those supplying data and briefly allow them to describe the data they're bringing, including the number of data notes.
Review this affinity diagramming process with the participants. Remind them that during the process everyone has the responsibility of placing or rearranging notes as they see fit. Discussion is discouraged until most or all notes are placed. If discussions break out, have a parking lot ready to park them.
Remind participants when breaks are scheduled and encourage them to take breaks.
Model
Prepare for modeling
Modeling proceeds a bit like a game where everyone supplying data gets a turn. In response to the current data supplier's turn, the other data suppliers make a response move. So that everyone gets to play, data suppliers should share their data notes with the session participants who don't have data to supply. Give these honorary data suppliers time to familiarize themselves with their newly acquired notes.
Now all players lay their notes out in front of them so they can easily locate specific notes.
One or two participants, depending on the total number of participants, can function as moderator and runner.
Read data aloud and place data
Start by choosing the first information provider and asking that person to read their first data note. Place the data note on the affinity diagram. A runner could grab and place the note for them if it's more convenient.
Read and place like data
After reading and placing a data note, all the other data providers review their data notes and identify notes that have affinity with this note. This means notes from different interviews, or different observations that are stating roughly the same thing, or expressing roughly the same sentiment. When a like note is found the information supplier calls it out and sticks the note immediately under the note it's similar to in a column. A runner could grab and place the note if it's more convenient.
After all like notes are placed, continue to the next data supplier asking them to read aloud and place a new element of data. Then allow all other suppliers to read aloud and place like data.
Place like data silently
As the work session progresses you'll see a large number of distinct columns begin to appear. At some point you'll find that as each data supplier reads a note, that a column already exists that the notes can be placed into. This happens when suppliers inadvertently miss data during the read and place like data phase of the game.
At this point in the game, all the data suppliers can stand up and silently place the remainder of their notes.
Celebrate
You're halfway done.
Begin distillation
At this point, with all the data represented, it's time to begin making sense of this data. As a group, led by a facilitator, begin to walk the length of the affinity diagram.
Split clusters
Each cluster or column of notes should contain some number of notes that have a high degree of affinity. When a cluster begins to get large - greater than five notes - you'll notice that there may be a couple different themes emerging in the data. Try to split large clusters into smaller clusters that more closely represent a single idea.
Combine clusters
When an affinity diagram contains a large number of clusters, you'll find that multiple clusters are inadvertently created for the same idea. Combine these clusters into a single cluster.
Distill clusters
Now it's time to begin choosing labels that summarize the cluster. For each cluster write a data note that can represent the data in this cluster. Write this distilling note on another color of sticky note. Let's choose pink. This cluster distillation is the most critical step of the process and can have the biggest impact on the usefulness of this data.
It's important that you don't create simple categorizations for your data. From the dictionary I'm looking at right now, I'll choose this definition of distill: to extract the essence of. The note you write to represent this cluster must extract the essence of all notes in the cluster and function as a stand in for all the notes in the cluster. A well written note allows me to bypass reading the other notes in the cluster - unless I'm really really interested.
Here's an example. Let's say I have the following notes from a series of interviews:
- Bob prints the reports and uses a highlighter pen to mark issues that stand out to him.
- Cecily reviews report to identify items that need her attention.
- Ron uses the report to find other instances of the same problem he's trying to solve.
If I created a distilling note that read:
- "report issue identification"
I'd be guilty of just creating a category heading for these notes. There aren't enough words there to tell me what report issue identification is or means, or why someone would do it. I'll need to read the notes to get that information and the flavor of exactly what the users interviewed said.
However, if I create a distilling note that reads:
- "I review the report and mark specific issues for further investigation."
I've created a note that extracts the essence of the other notes. I've also done another neat trick and converted the note into a first person sentence using the voice of the user. Of course you have to imagine Bob, Cecily, and Ron all saying the exact same thing in unison since they only get one sentence but I could certainly imagine them all saying that single sentence since it seems to express what they're concerned about.
To distill clusters of a large affinity diagram break the group into pairs. Walk the wall from end to end and discuss as a pair what the correct distilling data note might say. Write the data note on a sticky using an easy to read fat felt-tipped marker.
After doing this you should have cut the volume of notes you have to read to understand this data down to a quarter of the original size - meaning to understand 400 distinct pieces of data, I only need to read 100. And, what's more I still have traceability back to the original data.
Wow, that's impressive. But the fun's not over yet.
Categorize clusters
Where in the previous step we were cautious about not categorizing, in this step we'll allow it.
By now you'll start to see clusters forming around major themes. As a group discuss the themes you're seeing. Choose candidate categories or themes that these clusters could fit into.
Use the goal statement to help guide your choice. If you're consolidating user interview data focusing on understanding the work practice of these users, choose categories that reflect the names of the major activities that users engage in. If your group is working with business stakeholder interview data for the goal of identifying business goals, choose categories that reflect possible business goals, profit opportunities or risks. The category names you choose should be useful given the intended use of this data.
Write your candidate category names using a felt tipped marker on another color sticky. Let's choose green this time.
Place the category names high on the poster paper affinity diagram.
All team members then work together to relocate each cluster to a place under the category the cluster is best associated with.
Cluster clusters and distill
At this point you will have a modest number of categories labeled with a green sticky collecting lots of clusters of yellow data notes each capped with a pink distilling sticky.
Now it's time to further distill the clusters in each category. Look at the pink distilling sticky at the top of each cluster. Think of those distilling stickies as you would any other data note. Look for other pink stickies they have affinity with. As a team, relocate whole clusters so they appear together in the affinity diagram. Then, as you did with the yellow stickies before, break into pairs and write distilling stickies for each cluster of clusters. Let's choose blue stickies this time.
The same distillation guidelines apply. Try to write a statement on the blue sticky that extracts the essence of the pink stickies it organizes.
Celebrate again
You've completed the bulk of your model.
Walk the wall
At this point you should have a completed affinity diagram of consolidated data organized into a 4 level hierarchy. The top, green, level categorizes the data. The next level, the blue level, gives a high level summary of the information. The pink level gives more detail, and finally the yellow or bottom level gives the specific data from the specific sources that led to the distilled statements at higher levels.
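The four levels, and the traceability they preserve, can be pictured as a nested structure. The Python sketch below is only an illustration: the category and blue-level wording, the note IDs, and the `trace` function are all hypothetical sample values, not data from a real session.

```python
# Nested dicts mirror the wall:
# green category -> blue summary -> pink distillation -> yellow data note IDs.
affinity = {
    "Reviewing reports": {                                  # green (hypothetical)
        "I use reports to decide what to work on next.": {  # blue (hypothetical)
            "I review the report and mark specific issues for further investigation.": [  # pink
                "U1-4", "U2-11", "U3-7",                    # yellow (hypothetical IDs)
            ],
        },
    },
}


def trace(affinity, note_id):
    """Return the (green, blue, pink) labels above a yellow data note, or None."""
    for green, blues in affinity.items():
        for blue, pinks in blues.items():
            for pink, note_ids in pinks.items():
                if note_id in note_ids:
                    return green, blue, pink
    return None


print(trace(affinity, "U2-11"))
```

Because every yellow note carries its source and number, any distilled statement can be followed back down to the interviews or observations that produced it.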
As a group walk the length of the affinity diagram. Discuss anything interesting or surprising you see.
Inspect the data in each cluster to make sure it still looks correct. It's OK to relocate notes or clusters if things don't quite look right.
Reread the distilling pink and blue notes. Discuss them as a group. Make sure they can stand alone as useful information and that they accurately reflect the data notes they distill. Rewrite and replace them as necessary.
As the group walks the wall, look for any areas of weakness - areas where you could use more data. Make notes on another color sticky - a color you haven't yet used - and fix them directly to the model in the category they apply to.
Wrap-up
After walking the wall, discuss as a group any observations you have. Is there any further work to be done? Are there weak areas of the model where you might need more data?
Take a model photo of what you've built. You'll likely have to take several photos and stitch them together. Take a group photo or two to document the occasion.
Consider making a model movie. Have one of the participants walk the wall discussing the data, key findings, and ah-has in the data. Try to keep the recorded video to a few minutes, which may be difficult if you have lots of data.
As a group give parting takeaways.
Document & Communicate
Ideally the affinity diagram can stay posted permanently in a team area. However, things aren't always ideal. If you need to take the model down, fix the stickies to the model using removable tape so the stickies can be repositioned if necessary. Then fold the poster paper carefully. I've learned the hard way that rolling them up doesn't work quite as well.
Record data electronically
If the information needs to be communicated electronically, this sort of data can be organized into a spreadsheet fairly easily. For each row in the spreadsheet create a column for the group, first level distillation, and second level distillation. It's easy to sort or filter the information to see only the data you wish to see.
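As a rough sketch of that spreadsheet layout, the Python below writes one row per data note with its group and two distillation levels, then filters by group. The column names, the sample rows, and the `to_csv` and `filter_by` helpers are assumptions made for illustration; any spreadsheet tool can do the same thing.

```python
import csv
import io

COLUMNS = ["group", "second_level", "first_level", "note_id", "note"]

# Hypothetical rows: one per yellow data note, with the labels above it.
rows = [
    {"group": "Reviewing reports",
     "second_level": "I use reports to decide what to work on next.",
     "first_level": "I review the report and mark specific issues for further investigation.",
     "note_id": "U1-4",
     "note": "Prints the report and highlights issues that stand out."},
    {"group": "Reviewing reports",
     "second_level": "I use reports to decide what to work on next.",
     "first_level": "I review the report and mark specific issues for further investigation.",
     "note_id": "U2-11",
     "note": "Reviews the report to identify items that need attention."},
]


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def filter_by(rows, column, value):
    """Keep only the rows whose column matches the value exactly."""
    return [row for row in rows if row[column] == value]


print(to_csv(filter_by(rows, "group", "Reviewing reports")))
```

Keeping the note ID in its own column preserves the trace from each distilled statement back to the original source.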
Tools like InContext's CDTools were built for managing this sort of data - among other things. Consider using that if you need to keep this information electronically and share it with others.
A simple open source utility called Card Wall Generator, authored by Jeremy Stell-Smith, takes hierarchical spreadsheet data and uses it to generate a card wall in Microsoft's Visio. Visio works well to reprint the affinity diagram at actual size and put it back up on the wall should you need to walk the wall again in the future.
Prepare a presentable electronic distillation
The primary goal of this activity was to distill a large amount of data across a number of sources. The result can be ideal for presenting back to others.
When preparing an electronic summary of the data, decide on the level of detail you wish to go down to. You may choose to only describe categories and the major ideas in each category - the blue stickies. You may choose to go down one additional level to the pink stickies. You likely won't need to go down to the lowest level. It isn't much of a distillation any more if you do.
In your distillation also include the goal statement for the activity, the data sources you consolidated to produce this information, model photos taken during the model building session, and a reference to a model movie that was made.
Leveraging this model
This sort of model can be leveraged as foundational information for a variety of future design efforts. Keep the whole model handy, or an electronic distillation of it available for use during design meetings. With the full size model displayed, it's valuable to walk the length of the model and use stickies and ad hoc model annotation to attach your design ideas.
Knit in new data
You'll likely add more data to this model over time. Adding data to an existing model should follow roughly the same approach as creating a new model. Prepare the new data, schedule a collaborative modeling session, and knit the new data in. In the process of knitting it in, you'll find you add new clusters and split existing clusters. This will change the pink and blue notes distilling the information. Be ready for and celebrate this change. You're learning more.