OPLA Psychological Safety Survey: Sector Research for Ontario's Public Libraries
A province-wide survey of public library staff workplace experience, adapted from Guarding Minds at Work and aligned with ISO 45003:2021. 1,236 responses, 13 psychosocial factors, one volunteer committee.
I have served on the Ontario Public Library Association's Research and Evaluation Committee (OPLA R&E) since 2023. The committee is a volunteer body of public library staff, researchers, and sector partners that designs and runs studies on questions the libraries themselves want answered. This case study is about one of those studies: the 2024 Employee Psychological Safety Survey, the largest workplace-experience survey OPLA has run.
I want to be precise about my role. I am one of several committee members who contributed to this work. I co-designed survey instruments, supported data cleaning and analysis, helped synthesize results into accessible key messages, and helped present early findings at the OLA Super Conference. I did not lead the project. The methodological choices below are committee decisions; my contribution was to argue for some of them and to do the work that followed. The arguments I won are not necessarily the ones I am proudest of, and the ones I lost are sometimes the ones that taught me the most about how a research committee actually decides anything.
The problem
Public library staff in Ontario do work that looks calm from the outside and is not. Reference desks absorb housing crises, mental health crises, and behavioural incidents that the rest of the social safety net is no longer catching. Programming staff plan for unpredictable attendance. Branch managers run buildings that are simultaneously a library, a daytime shelter, and a public computing lab. The sector knew anecdotally that staff wellbeing was strained. There was no provincial-scale evidence base.
The committee's question was narrow: across Ontario's public libraries, what does psychological safety at work actually look like, and where are the pressure points sharp enough that boards and CEOs should act on them?
Why this approach
Two design choices set the shape of the project.
First, the instrument. Rather than write a custom survey from scratch, the committee adapted Guarding Minds at Work, a Canadian instrument developed at the Centre for Applied Research in Mental Health and Addiction. Guarding Minds is aligned with ISO 45003:2021 (the international standard on psychological health and safety at work) and is structured around 13 psychosocial factors: Psychological Support, Organizational Culture, Clear Leadership and Expectations, Civility and Respect, Psychological Job Demands, Growth and Development, Recognition and Reward, Involvement and Influence, Workload Management, Engagement, Balance, Psychological Protection, and Physical Safety. Using an established framework meant we could compare findings against existing benchmarks rather than defending a homegrown taxonomy.
Second, the sampling. The survey ran in two waves. Wave one (September to October 2024) went out through 52 participating library systems, which distributed it internally to staff. That wave produced 1,022 responses. Wave two (November 2024 through January 2025) was distributed via OLA membership channels to capture staff at libraries that had not opted in to the first wave. That added 214 responses, for a total of 1,236.
The two-wave design was a tradeoff. A single channel through library systems would have given cleaner denominators and clearer response rates per branch. The OLA-membership wave broadened reach at the cost of knowing exactly which population the second-wave respondents represent. The committee took that tradeoff knowingly because the alternative was excluding entire library systems that had not formally opted in.
How the instrument was built
The questionnaire mixed a 5-point Likert agreement scale (Strongly Disagree to Strongly Agree) with four yes/no items on specific behaviours, plus demographic questions on role, tenure, library system size, region, and identity dimensions.
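To make the coding concrete, here is a minimal sketch of how those mixed response types might be mapped for analysis. The response labels come from the instrument; the numeric codes, dictionary names, and helper function are illustrative assumptions, not the committee's actual pipeline.

```python
# Hypothetical coding map for the mixed response types.
# Labels are from the instrument; the 1-5 codes are a conventional assumption.
LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}
YES_NO_CODES = {"Yes": True, "No": False}

def code_likert(label: str | None) -> int | None:
    """Map a Likert label to 1-5; unanswered items stay missing (None)."""
    return LIKERT_CODES.get(label) if label is not None else None

print(code_likert("Agree"), code_likert(None))  # -> 4 None
```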
A few methodological calls worth naming, because they shape how the results read:
No "Skipped" option on Likert items. Respondents who did not want to answer simply did not answer. The committee deliberately did not offer a "Prefer not to say" Likert response because adding that option tends to inflate it as a low-effort default. The cost is more missing data per item; the benefit is that the responses we do get are considered.
Pairwise deletion for missing responses. When computing per-factor scores, we used pairwise deletion rather than listwise. A respondent who answered 11 of the 13 factors still contributes to those 11. Listwise deletion would have thrown away a meaningful share of usable data given the survey's length. The cost is that different factors have slightly different denominators; we report N per factor in the appendix rather than papering over it. The risk worth naming is that respondents who skip late items in a long survey are not skipping at random. The factors that sit late in the instrument may show systematically different denominators from the factors near the start, and that itself is a finding rather than a nuisance. We did not adjust for it; a future round should.
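A minimal sketch of the difference, in Python with pandas. The column names and toy data are hypothetical (the real instrument has multiple items per factor and thirteen factors, not two); the point is that each factor's mean and N are computed over whoever answered that factor.

```python
import numpy as np
import pandas as pd

# Likert responses coded 1-5; NaN marks an item the respondent skipped.
# Columns and values are invented for illustration.
responses = pd.DataFrame({
    "workload_management": [4, 2, np.nan, 5],
    "recognition_reward":  [3, np.nan, np.nan, 4],
})

# Listwise deletion would keep only complete cases: here 2 of 4 respondents.
complete_cases = responses.dropna()

# Pairwise deletion: each factor's mean and N use everyone who answered
# that factor, so denominators legitimately differ across factors.
for factor in responses.columns:
    col = responses[factor]
    print(f"{factor}: mean={col.mean():.2f}, N={int(col.notna().sum())}")
```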
Five-point scale with the neutral midpoint kept, by design. We kept the conventional midpoint ("Neither Agree nor Disagree") because removing it forces a stance respondents may not actually hold, which produces noisier data than a clearly labelled neutral. We accept that the midpoint absorbs some genuine ambivalence.
Anonymity over linkability. Because respondents are public-sector employees, some of whom work in branches small enough that any demographic crosstab risks re-identification, the committee suppressed cells under a minimum count threshold in published outputs. That meant some demographic intersections we would have liked to report could not be.
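A minimal sketch of that suppression step. The threshold shown is illustrative, not the committee's actual value, and the crosstab is invented.

```python
import pandas as pd

def suppress_small_cells(crosstab: pd.DataFrame, min_n: int = 5) -> pd.DataFrame:
    """Blank any cell whose count falls below min_n before publication."""
    return crosstab.mask(crosstab < min_n)

# Hypothetical role-by-region counts; the Rural/Manager cell (3) gets blanked.
counts = pd.DataFrame(
    {"Librarian": [28, 9], "Manager": [12, 3]},
    index=["Urban", "Rural"],
)
print(suppress_small_cells(counts))
# Suppressed cells surface as NaN here; a publication step would render
# them as blank or an asterisk rather than a number.
```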
What the data said
The 13 factors stratify cleanly into a top tier and a bottom tier. The numbers below are the share of respondents agreeing or strongly agreeing with the positively framed factor statements; a minimal sketch of that computation follows the two lists.
Top of the distribution:
- Psychological Protection: 83 percent
- Physical Safety: 82 percent
- Engagement: 80 percent
Bottom of the distribution:
- Organizational Culture: 47 percent positive (a majority of respondents do not agree their organization's culture supports their psychological wellbeing)
- Involvement and Influence: 57 percent
- Recognition and Reward: 58 percent
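The computation behind those shares, sketched minimally: the top-two-box fraction of non-missing responses at 4 (Agree) or 5 (Strongly Agree), with the denominator following the pairwise-deletion rule described above. The data here are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical coded responses for one factor; NaN is a skipped item.
scores = pd.Series([5, 4, 3, np.nan, 2, 4, 5], name="engagement")

answered = scores.dropna()                  # pairwise: non-responses drop out
top_two_box = (answered >= 4).mean() * 100  # percent agree / strongly agree
print(f"{scores.name}: {top_two_box:.0f}% positive (N={len(answered)})")
```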
The top-tier reading is broadly positive: staff feel physically safe, feel free to express themselves at work without fear of negative consequences, and are engaged in their roles. The bottom-tier reading is the headline for the sector. Culture, voice, and recognition are the three factors where the gap between "I am here and I do the work" and "the organization sees me and shapes itself around what I bring" is widest.
Two demographic patterns stood out and are worth flagging because they push against intuition.
Burnout was highest among managers and librarians, not frontline staff. The conventional read of frontline library work is that the public-facing roles bear the heaviest psychological load. The data say otherwise: it is the supervisory and credentialed roles that show the strongest burnout signals. Two interpretations are plausible (management absorbs upward and downward pressure simultaneously; credentialed staff hold themselves to higher self-imposed standards). The data do not adjudicate between them. This was the finding the committee discussed longest before agreeing on language, because it sits sideways to how libraries are usually funded and how training money is usually directed. The instinct in any sector is to direct the most support at the most visible roles. A finding that the supervisory tier is where the burnout pressure lands is not a comfortable thing to put in a sector report, but it is what the data showed.
Staff with 4 to 5 years of tenure are underrepresented in the sample. The tenure distribution has a visible dip at that band. The committee's working hypothesis is that this corresponds to the 2019 to 2020 hiring freeze across many Ontario library systems during the early pandemic period: the cohort that would have been hired then was not. If correct, it implies the sector will see a structural gap in mid-career staff for the next several years and should plan around it.
What I would do differently
A few methodological things I would push for on the next round.
A shorter instrument with rotation. The survey is long. Long surveys produce attrition and item-fatigue noise toward the end. A planned-missingness design (each respondent sees a random subset of factor items) would give us comparable per-factor coverage with less respondent burden. The cost is statistical complexity in analysis; the benefit is cleaner data per item.
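A minimal sketch of what that assignment could look like, assuming a deterministic per-respondent draw so that which blocks a respondent saw can always be reconstructed from their ID. The subset size is illustrative, not a committed design.

```python
import random

# The 13 psychosocial factors from the adapted Guarding Minds framework.
FACTORS = [
    "Psychological Support", "Organizational Culture",
    "Clear Leadership and Expectations", "Civility and Respect",
    "Psychological Job Demands", "Growth and Development",
    "Recognition and Reward", "Involvement and Influence",
    "Workload Management", "Engagement", "Balance",
    "Psychological Protection", "Physical Safety",
]

def assign_blocks(respondent_id: int, n_shown: int = 9) -> list[str]:
    """Sample which factor blocks a respondent sees, seeded by their ID
    so the assignment is reproducible and auditable after the fact."""
    rng = random.Random(respondent_id)
    return rng.sample(FACTORS, n_shown)

print(assign_blocks(42))
```

With 9 of 13 blocks shown, every factor still reaches roughly 9/13 of respondents, which keeps per-factor coverage comparable while cutting each respondent's burden by about a third.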
Pre-register analyses. This was a descriptive study, not a hypothesis test, but the line is thinner than it sounds. Naming, in advance, which subgroup comparisons are confirmatory and which are exploratory would have saved some downstream interpretation arguments.
Plain-language reporting alongside the technical report. We did this for the Super Conference presentation. It should have been planned as a deliverable from week one, not added at the end.
How this informs my day job
I work on KPI dashboards and outcome reporting at metricHEALTH. The statistical surface is recognizable: Likert-style satisfaction items, demographic stratification, missing-data handling, and the constant tension between what the data wants to say and what a small-cell suppression rule will let you publish. The OPLA work was, in effect, the same problem in a different sector with different stakes. The instinct it sharpened most was the one about plain-language reporting: a finding that nobody can act on is not a finding, regardless of how cleanly it sits in the appendix. That is the bar a sector report has to clear, and it is the bar a clinical KPI dashboard has to clear too.
The other transferable habit is the small-cell suppression rule itself. In a healthcare reporting context, a frontend that proudly displays a 100 percent satisfaction rate computed over three respondents is a worse artifact than one that hides the cell entirely. The OPLA committee held that line carefully: better to leave a cell blank than to give a board chair a number they will quote in a meeting and that does not survive the next sample. I bring the same instinct to dashboard design at work, and it is one of the few opinions I will defend to a product manager who wants every cell populated.
The committee continues. The next round of OPLA research is in scoping as of early 2026, and I am still on the committee.