Accuracy of cervical cytology: comparison of diagnoses of 100 Pap smears read by four pathologists at three hospitals in Norway
BMC Clinical Pathology volume 17, Article number: 18 (2017)
Cervical cancer can be prevented by early detection and treatment of precancerous lesions. Since 1995, there has been a national cervical cancer screening program in Norway, where women aged 25–69 years are recommended to take Pap smears every three years. There are 17 cytology laboratories covering a population of 5 million people. The detection rate of cervical abnormalities varies from laboratory to laboratory. We wanted to investigate the accuracy of cytology diagnoses by four different pathologists at three different hospitals in Norway.
One hundred Pap smears (20 Normal, 20 ASC-US, 20 LSIL, 20 ASC-H and 20 HSIL) screened at UNN in 2015 were evaluated by four pathologists at three hospitals in Norway. All patients were followed up through December 2016. Histologically confirmed high-grade dysplasia (CIN2+) was considered as study endpoint.
The number of Pap smears evaluated as abnormal (ASC-US+) by the four pathologists varied from 61 to 85. The number of high-grade cytology (ASC-H+) varied from 26 to 50. There was moderate agreement (weighted kappa 0.45–0.58) between the observers. There were 32 women with high-grade histology (CIN2+) in the follow-up, including 19 CIN2, 12 CIN3 and one squamous cell carcinoma (SCC). Using high-grade cytology (ASC-H+) as cut-off, the sensitivity for CIN2+ varied from 68.8% to 93.8% (mean 77.4%) and specificity from 70.6% to 95.6% (mean 81.3%). The pathologist with the highest sensitivity for CIN2+ had the highest false positive rate and the lowest specificity (p<0.05). The accuracy for CIN2+ varied from 74.1% to 83.8% (mean 79.4%). The Pap smear from the woman with cervical cancer was diagnosed as high-grade (ASC-H+) by one of the four pathologists.
Cervical cancer screening based on cytology has limited accuracy. The study revealed a moderate agreement between the observers, along with a trade-off between sensitivity and specificity. This might indicate that hospitals with high detection rates of cervical cytology have higher sensitivity for CIN2+ but lower specificity.
Cervical cancer is caused by human papillomavirus (HPV) and develops over many years through a series of precancerous steps [1, 2]. The disease can be prevented by HPV vaccination or by screening with HPV tests or Pap smears [3, 4]. Since 2009, there has been an HPV vaccination program for 12-year-old girls in Norway. The program’s coverage is around 80%. Since November 2016, there has been an ongoing two-year catch-up vaccination program for women aged 20–25 years, with an expected coverage rate of 40–45% (www.fhi.no). Since 2015, there has been a pilot for HPV testing in primary screening in four counties. In this pilot, women 34 years and older are randomized to a Pap smear every three years or an HPV test every five years. However, in most parts of Norway, the cervical screening program is still based on cervical cytology.
Since 1995, there has been a national cervical cancer screening program in Norway, where women aged 25–69 years are recommended to take Pap smears every three years. Women with high-grade cytology (ASC-H / HSIL) are referred to a gynecologist for colposcopy and biopsy. HPV test is used in triage of women with low-grade cytology (ASC-US / LSIL). The cervical screening program has a coverage of 60% after 3.5 years. The Norwegian Cancer Registry sends a reminder to women without a Pap smear after three years and a new reminder after four years. The coverage is 80% after 5 years. Most Pap smears are taken by GPs, while some samples are taken by gynecologists. There are 17 different laboratories involved in the screening program, and most of these use liquid-based cytology (ThinPrep or SurePath).
It is well known that cervical cytology has limited sensitivity and reproducibility [7,8,9,10,11,12]. Diagnoses may vary from cytotechnician to cytotechnician, from pathologist to pathologist and from lab to lab [9, 11, 12]. All cervical cytology diagnoses, results of HPV tests and biopsies from all laboratories in Norway are reported to the Norwegian Cancer Registry, which drafts annual reports with feedback to each laboratory, including the distribution of their diagnoses compared with the national average (Table 1).
There is high variability in detection rates across hospitals. This may be due to higher sensitivity, lower specificity, regional differences in the prevalence of HPV, cervical dysplasia and cancer, or a combination of these causes. We wanted to investigate the accuracy of cytology diagnoses by four different pathologists at three different hospitals in Norway.
One hundred cervical cytological samples screened at UNN in 2015 with the diagnoses normal, ASC-US, LSIL, ASC-H and HSIL were sent to the Departments of Pathology in Bergen (HUS), Bodø (Nordland), Fredrikstad (Østfold), Stavanger (SUS) and Tønsberg (Vestfold). The pathologist at the Department of Pathology in Bergen did not have time to participate in the study, and he forwarded the slides to Stavanger without looking at them. Two cytotechnologists at the Department of Pathology in Fredrikstad diagnosed the slides, but they were trained to screen SurePath samples. Their results were therefore excluded from this study, which was based on ThinPrep samples.
All slides were first screened by a cytotechnologist at UNN and then evaluated by a pathologist at UNN (P1, reference). The abnormal cells were marked on the slides before being dispatched for the study. The slides were not screened at the other hospitals. The four other pathologists (P2–P5) at other hospitals were to evaluate only the abnormal cells marked on the slides. The other pathologists were blinded to age, previous findings, clinical information and HPV result. Diagnoses from each of the four pathologists were compared with diagnoses from the three other pathologists. Women with abnormal findings at UNN were followed up according to national guidelines. In Norway, the Bethesda System for Reporting Cervical Cytology is used by all laboratories. All patients were followed up through December 2016. Histologically confirmed high-grade dysplasia (CIN2+) was considered the study endpoint (gold standard). When calculating the sensitivity and specificity, women with normal Pap smears, and women with low-grade cytology (ASC-US / LSIL) and negative HPV test without histology, were considered free of high-grade dysplasia (CIN1-).
All analyses were done in IBM SPSS Statistics, version 23, with Chi-square test for categorical variables and t-test for continuous variables. For accuracy of cytological diagnoses between different observers, we used weighted kappa with linear weights.
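The agreement statistic named above can be illustrated with a minimal sketch. The study's analyses were run in SPSS; the Python function below is only an illustration of how a linearly weighted kappa is commonly defined, and the function name, the category ordering and the example readings are assumptions made for this sketch rather than material from the study.

```python
from collections import Counter

# Ordered cytology categories; the ordering is an assumption for this sketch.
CATEGORIES = ["Normal", "ASC-US", "LSIL", "ASC-H", "HSIL"]


def linear_weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of slides")
    index = {name: i for i, name in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)
    a = [index[x] for x in rater_a]
    b = [index[x] for x in rater_b]

    # Observed mean disagreement, scaled to the range 0..1.
    observed = sum(abs(i - j) for i, j in zip(a, b)) / (n * (k - 1))

    # Disagreement expected by chance, from each rater's marginal counts.
    row, col = Counter(a), Counter(b)
    expected = sum(
        abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k)
    ) / (n * n * (k - 1))
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1 - observed / expected


# Hypothetical readings of five slides by two observers (not study data).
reader_1 = ["Normal", "ASC-US", "LSIL", "ASC-H", "HSIL"]
reader_2 = ["ASC-US", "ASC-US", "LSIL", "ASC-H", "HSIL"]
print(round(linear_weighted_kappa(reader_1, reader_2), 3))
```

With linear weights, a disagreement of one category (for example Normal versus ASC-US) is penalized a quarter as much as a disagreement across the whole five-category scale (Normal versus HSIL).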
Of the 100 cervical cytology samples, 20 were diagnosed Normal, 20 ASC-US, 20 LSIL, 20 ASC-H and 20 HSIL at UNN. There were 32 women with high-grade histology (CIN2+) in the follow-up, including 19 CIN2, 12 CIN3 and one squamous cell carcinoma (SCC). There were no CIN2+ in women with Normal diagnosis, one CIN2+ in women with ASC-US diagnoses, three women with CIN2+ in the LSIL group, 10 CIN2+ in the ASC-H group and 18 CIN2+ in the HSIL group (Table 2). Using high-grade cytology (ASC-H+) as cut-off, the sensitivity for CIN2+ at UNN was 87.5% (28/32).
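As a worked check of the reference reading, the per-category counts in this paragraph can be folded into a two-by-two table with ASC-H+ as the cut-off. The sketch below uses only the counts given in the text; the variable and function names are illustrative. The specificity it derives follows from those counts and from the rule that women without CIN2+ are counted as CIN1-, but it is not a figure reported in the text.

```python
# Samples per UNN cytology category and CIN2+ cases found in follow-up (from the text).
samples = {"Normal": 20, "ASC-US": 20, "LSIL": 20, "ASC-H": 20, "HSIL": 20}
cin2_plus = {"Normal": 0, "ASC-US": 1, "LSIL": 3, "ASC-H": 10, "HSIL": 18}
HIGH_GRADE = {"ASC-H", "HSIL"}  # test-positive categories for the ASC-H+ cut-off


def two_by_two(samples, cin2_plus, positive_categories):
    """Return (TP, FN, FP, TN) for a cytology cut-off against CIN2+ histology."""
    tp = sum(cin2_plus[c] for c in samples if c in positive_categories)
    fn = sum(cin2_plus[c] for c in samples if c not in positive_categories)
    fp = sum(samples[c] - cin2_plus[c] for c in samples if c in positive_categories)
    tn = sum(samples[c] - cin2_plus[c] for c in samples if c not in positive_categories)
    return tp, fn, fp, tn


tp, fn, fp, tn = two_by_two(samples, cin2_plus, HIGH_GRADE)
sensitivity = tp / (tp + fn)  # 28 / 32, the 87.5% reported for UNN
specificity = tn / (tn + fp)  # 56 / 68, derived here rather than reported
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"sensitivity={sensitivity:.1%} specificity={specificity:.1%}")
```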
The number of samples diagnosed as “Normal” varied from 15 to 39 by the four pathologists, with a mean of 28.8. One pathologist (P2) had significantly fewer “Normal” cases than the average of the four pathologists (p<0.05) (Table 3). The corresponding ranges for ASC-US, LSIL, ASC-H and HSIL were 17 to 24 (mean 19.8), 9 to 20 (mean 14.0), 10 to 18 (mean 13.3) and 16 to 32 (mean 24.0), respectively (Table 3); none of these differences were significant. There was moderate agreement between the observers (weighted kappa 0.45–0.58) (Table 4). The kappa statistics were not statistically different.
The agreement of the different diagnoses was higher for “Normal” and “HSIL” samples than for the other diagnoses (ASC-US, LSIL and ASC-H) (Additional file 1: Tables S1–S5). The number of high-grade cytology diagnoses (ASC-H+) varied from 26 (P4) to 50 (P2). Of 61 women with at least one high-grade cytology, 17 samples (27.9%) were considered high-grade by all four observers (Additional file 1: Figure S1).
The number of true positives (CIN2+) using ASC-H+ as a cut-off varied from 22 to 30 (mean 24.8) (Additional file 1: Figure S2 and Table 5). The corresponding sensitivity for CIN2+ varied from 68.8% to 93.8% (mean 77.4%). One pathologist (P2) had significantly higher sensitivity than the average of the four pathologists (p<0.05) (Table 5). Of 32 women with CIN2+, 15 samples (46.9%) were considered high-grade by all four observers (Additional file 1: Figure S2). One woman with CIN2 was not considered to have high-grade cytology by any of the four observers (patient 57, Additional file 1: Table S3).
The number of true negatives (CIN1-) using LSIL- as a cut-off varied from 48 to 65 (mean 55.3). The corresponding specificity ranged from 70.6% to 95.6% (mean 81.3%) (Table 5). One pathologist (P2) had significantly lower specificity and one pathologist (P4) had significantly higher specificity than the average of the four pathologists (p<0.05) (Table 5). The pathologist (P2) with the highest sensitivity for CIN2+ had the highest false positive rate and the lowest specificity (Table 5). The accuracy for CIN2+ varied from 74.1% to 83.8% (mean 79.4%). There were no statistically significant differences in accuracy (Table 5).
The Pap smear from the woman with cervical cancer (SCC) was diagnosed as high-grade (ASC-H+) by one of the four pathologists (P2), while three pathologists diagnosed her as ASC-US (Additional file 1: Table S5). The woman had a positive HPV test for HPV type 16 (data not shown).
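The percentages reported above follow directly from the stated counts. The short sketch below recomputes them, assuming that the denominators are the 32 women with CIN2+ and the remaining 68 women counted as CIN1-; the helper name `percent` is illustrative.

```python
def percent(count, total):
    """Express count / total as a percentage."""
    return 100 * count / total


CIN2_PLUS = 32                 # women with high-grade histology in follow-up
CIN1_MINUS = 100 - CIN2_PLUS   # remaining women, counted as free of CIN2+

sensitivity_range = (percent(22, CIN2_PLUS), percent(30, CIN2_PLUS))
specificity_range = (percent(48, CIN1_MINUS), percent(65, CIN1_MINUS))
high_grade_by_all = percent(17, 61)            # among women with any high-grade reading
cin2_plus_by_all = percent(15, CIN2_PLUS)      # among women with CIN2+

print("sensitivity: {:.1f}% to {:.1f}%".format(*sensitivity_range))
print("specificity: {:.1f}% to {:.1f}%".format(*specificity_range))
print(f"high-grade by all four observers: {high_grade_by_all:.1f}%")
print(f"CIN2+ called high-grade by all four observers: {cin2_plus_by_all:.1f}%")
```

The recomputed ranges agree with the reported 68.8% to 93.8% for sensitivity and 70.6% to 95.6% for specificity, which supports the assumed denominators.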
The study’s purpose was to investigate the accuracy of cytology diagnoses by four different pathologists at three hospitals using 100 Pap smears with different cytological diagnoses screened at UNN. The agreement of the cytological diagnoses between the four pathologists in this study was “moderate.” A moderate agreement is better than “fair,” but worse than “substantial.” The kappa statistics were not statistically different.
In Norway there are 17 cytology laboratories covering a population of 5 million people. All the laboratories receive most of their samples from general practitioners in primary screening. The population in Norway is quite homogeneous, and women in different parts of the country are broadly similar. The differences between the various laboratories are probably caused by different interpretations of the Bethesda criteria. Two pathologists (P4 and P5) were from the same laboratory but still used very different diagnoses for the same patients.
In the ATHENA study, the sensitivity of cytology varied from 42.0% to 73.0%. In our study, the sensitivity for CIN2+ varied from 68.8% to 93.8%, but all the smears were first screened at the same hospital, and abnormal cells were marked on the slide. It is easy to find abnormal cells on a slide full of marks. In a population with a given prevalence of CIN2+, the sensitivity of cytology is dependent on the detection rate. In the ATHENA study, the positivity rate of cytology in primary screening varied from 3.8% to 9.9% while the detection rate of HPV DNA test (Cobas 4800) varied from 10.9% to 13.4%. In our study, the detection rate of high-grade cytology (ASC-H / HSIL) varied from 26.0% to 50.0%, while the detection rate of HPV DNA test (Cobas 4800) was 74.3% (52/70).
In our study, the accuracy varied from 74.1% to 83.8% (mean 79.4%). In five published studies the accuracy varied from 64.2% to 78.4% (mean 76.1%) (Table 6). There was less variation between the four pathologists in our study than between the five published studies. The mean accuracy of the four pathologists in our study was significantly higher than the mean of the five published studies (79.4% vs 76.1%, p<0.05).
There is a trade-off between sensitivity and specificity in cervical cancer screening. In our study, the pathologist with significantly higher sensitivity for CIN2+ than the average also had significantly lower specificity. In general, laboratories with a high detection rate of cytology also have higher sensitivity for CIN2+. If the sensitivity is higher, the hospital detects more women with CIN2/3 who can be treated, and fewer women develop cervical cancer before the next screening round. When women with low-grade cytology (ASC-US / LSIL) are triaged with an HPV test, a high detection rate of low-grade cytology should not be considered a major problem. A false positive ASC-US will have a negative HPV test and does not need follow-up. A false negative “Normal” cytology has no indication for HPV testing, according to Norwegian guidelines (www.kreftregisteret.no).
Cytology is subjective with poorly reproducible criteria. HPV testing is more objective with strictly defined criteria. Co-testing with both cytology and HPV test may reduce the risk of false negative cytology when the pathologists take the HPV result into consideration when evaluating the cytological slide. In our study, only the observer at UNN (P1, reference) knew the HPV result. All other observers were blinded to clinical information and HPV result, which might explain the lower sensitivity for CIN2+ for some of the other pathologists. Originally, in the ATHENA study, cytology was reviewed blinded to HPV status. When the same slides were re-reviewed unblinded to HPV status, the sensitivity for CIN3+ of co-testing increased from 54.1% to 62.4% (P = 0.0015). In our study, the mean sensitivity for CIN2+ for the four external pathologists was 77.4% based on slides screened at the same hospital.
The present study also has other weaknesses. For P1 the diagnoses were set in normal routine work, while the cases for the other four pathologists had to be diagnosed in addition to their normal workload. This might affect the interpretation. In addition, only P1 at UNN had access to the initial diagnoses suggested by the cytotechnician. In daily practice the pathologist usually compares his or her initial impression with the diagnosis suggested by the cytotechnician. If there is a discrepancy, the slide is reviewed. This might explain the lower sensitivity of some of the other pathologists. In normal routine work, difficult cases will be discussed with other pathologists. In this study, the pathologists reviewed all the slides alone.
Out of the 100 women in this study, there was one woman with cervical cancer. Three of the four pathologists diagnosed her cytology as ASC-US. According to Norwegian guidelines, women with ASC-US and a positive HPV result should be followed up with a new cytology and HPV test after 6–12 months. Only women with persistent HPV infection should be referred to a gynecologist for colposcopy and biopsy (www.kreftregisteret.no). For this woman, an ASC-US diagnosis may therefore delay diagnosis and treatment and worsen her prognosis.
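The consequence of an ASC-US reading can be made concrete with a simplified sketch of the follow-up rules as they are described in this article. It is not a rendering of the Norwegian guidelines themselves: the function name and return strings are illustrative, and applying the ASC-US rule for HPV-positive women to LSIL as well is an assumption of this sketch.

```python
HIGH_GRADE = {"ASC-H", "HSIL"}
LOW_GRADE = {"ASC-US", "LSIL"}


def next_step(cytology, hpv_positive=None):
    """Return the follow-up step described in the text for one cytology result.

    hpv_positive is None when no HPV triage result is available yet.
    """
    if cytology in HIGH_GRADE:
        return "refer to gynecologist for colposcopy and biopsy"
    if cytology in LOW_GRADE:
        if hpv_positive is None:
            return "triage with HPV test"
        if hpv_positive:
            # Stated in the text for ASC-US; assumed here to hold for LSIL too.
            return "new cytology and HPV test after 6-12 months"
        return "no follow-up needed"
    if cytology == "Normal":
        return "no indication for HPV testing"
    raise ValueError(f"unknown cytology category: {cytology}")


# The woman with cervical cancer: read as ASC-US by three pathologists, HPV 16 positive.
print(next_step("ASC-US", hpv_positive=True))
# The same slide read as high-grade, as one pathologist did.
print(next_step("ASC-H"))
```

The two calls show why the reading matters: the high-grade diagnosis leads directly to colposcopy, while the ASC-US diagnosis leads to a repeat test after 6–12 months.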
There were statistically significant differences in sensitivity and specificity for CIN2+ (p<0.05) between the observers, but not in accuracy. In a low resource setting, specificity is important to reduce colposcopy workload. In a high resource setting like Norway, sensitivity is more important to reduce the number of cervical cancers. Specificity of cytology can be improved by HPV testing in the triage of ASC-US / LSIL. The costs of a high number of HPV tests are of minor importance in a high resource setting. In the USA, co-testing (cytology and HPV test) every five years is recommended for women 30–60 years of age [10, 14].
Cervical cancer screening based on cytology has limited accuracy. The study revealed a moderate agreement between the observers, along with a trade-off between sensitivity and specificity. This might indicate that hospitals with high detection rate of cervical cytology have higher sensitivity for CIN2+, but lower specificity.
- ASC-H:
Atypical squamous cells – cannot exclude HSIL
- ASC-US:
Atypical squamous cells of undetermined significance
- CIN:
Cervical intraepithelial neoplasia, also known as cervical dysplasia
- CIN1, CIN2, CIN3:
Cervical intraepithelial neoplasia grade 1, 2 or 3, also known as low-grade, moderate or severe cervical dysplasia
- CIN2+:
CIN2, CIN3, adenocarcinoma in situ (ACIS) or cervical cancer
- HPV DNA test:
Cobas 4800 detects DNA from 14 high-risk HPV types (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66 and 68) at clinically relevant infection levels
- HSIL:
High-grade squamous intraepithelial lesion
- LSIL:
Low-grade squamous intraepithelial lesion
- NPV:
Negative predictive value
- Pap smear:
The Papanicolaou test, also known as Pap test, cervical smear or cervical cytology
- PPV:
Positive predictive value
- WHO:
The World Health Organization
Munoz N, Bosch FX, de SS, Herrero R, Castellsague X, Shah KV, et al. Epidemiologic classification of human papillomavirus types associated with cervical cancer. N Engl J Med. 2003;348:518–27.
Walboomers JM, Jacobs MV, Manos MM, Bosch FX, Kummer JA, Shah KV, et al. Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. J Pathol. 1999;189:12–9.
de Blasio BF, Neilson AR, Klemp M, Skjeldestad FE. Modeling the impact of screening policy and screening compliance on incidence and mortality of cervical cancer in the post-HPV vaccination era. J Public Health (Oxf). 2012;34:539–47.
Saslow D, Solomon D, Lawson HW, Killackey M, Kulasingam SL, Cain J, et al. American Cancer Society, American Society for Colposcopy and Cervical Pathology, and American Society for Clinical Pathology screening guidelines for the prevention and early detection of cervical cancer. Am J Clin Pathol. 2012;137:516–42.
Skare GB, Lonnberg S, Bjorge T, Trope A. The Norwegian cervical cancer screening programme. Annual report 2015. The Cancer Registry of Norway; 2016.
Andreassen T, Vogt C. Screening for cervical cancer--future perspectives. Tidsskr Nor Laegeforen. 2014;134:1122–3.
Arbyn M, Buntinx F, Van Ranst M, Paraskevaidis E, Martin-Hirsch P, Dillner J. Virologic versus cytologic triage of women with equivocal Pap smears: a meta-analysis of the accuracy to detect high-grade intraepithelial neoplasia. J Natl Cancer Inst. 2004;96:280–93.
Arbyn M, Roelens J, Simoens C, Buntinx F, Paraskevaidis E, Martin-Hirsch PP, et al. Human papillomavirus testing versus repeat cytology for triage of minor cytological cervical lesions. Cochrane Database Syst Rev. 2013;3:CD008054.
Bigras G, Wilson J, Russell L, Johnson G, Morel D, Saddik M. Interobserver concordance in the assessment of features used for the diagnosis of cervical atypical squamous cells and squamous intraepithelial lesions (ASC-US, ASC-H, LSIL and HSIL). Cytopathology. 2013;24:44–51.
Katki HA, Kinney WK, Fetterman B, Lorey T, Poitras NE, Cheung L, et al. Cervical cancer risk for women undergoing concurrent testing for human papillomavirus and cervical cytology: a population-based study in routine clinical practice. Lancet Oncol. 2011;12:663–72.
Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA. 2001;285:1500–5.
Wright TC Jr, Stoler MH, Behrens CM, Sharma A, Sharma K, Apple R. Interlaboratory variation in the performance of liquid-based cytology: insights from the ATHENA trial. Int J Cancer. 2014;134:1835–43.
Wright TC Jr, Stoler MH, Aslam S, Behrens CM. Knowledge of Patients' Human Papillomavirus Status at the Time of Cytologic Review Significantly Affects the Performance of Cervical Cytology in the ATHENA Study. Am J Clin Pathol. 2016;146:391–8.
Massad LS, Einstein MH, Huh WK, Katki HA, Kinney WK, Schiffman M, et al. 2012 updated consensus guidelines for the management of abnormal cervical cancer screening tests and cancer precursors. Obstet Gynecol. 2013;121:829–46.
Dillner J, Rebolj M, Birembaut P, Petry KU, Szarewski A, Munk C, et al. Long term predictive values of cytology and human papillomavirus testing in cervical cancer screening: joint European cohort study. BMJ. 2008;337:a1754.
Castle PE, Glass AG, Rush BB, Scott DR, Wentzensen N, Gage JC, et al. Clinical human papillomavirus detection forecasts cervical cancer risk in women over 18 years of follow-up. J Clin Oncol. 2012;30:3044–50.
Szarewski A, Ambroisine L, Cadman L, Austin J, Ho L, Terry G, et al. Comparison of predictors for high-grade cervical intraepithelial neoplasia in women with abnormal smears. Cancer Epidemiol Biomarkers Prev. 2008;17:3033–42.
Sorbye SW, Arbyn M, Fismen S, Gutteberg TJ, Mortensen ES. Triage of women with low-grade cervical lesions--HPV mRNA testing versus repeat cytology. PLoS One. 2011;6:e24083.
This study was initiated by Gry Andersen, Director, Division of Diagnostic Services, University Hospital of North Norway. We want to thank Lars Uhlin-Hansen and Liv Hansen, at the Department of Clinical Pathology at University Hospital of North Norway, who were involved with the study’s design. We are grateful to Teresa Grid, Renate Veronica Hansen and the other pathologists, cytotechnicians and staff members at the departments of pathology in Bodø, Fredrikstad, Stavanger, Tromsø and Tønsberg who made this study possible. We also want to thank Frode Skjold for his assistance with the statistics.
Availability of data and materials
All data generated or analyzed during this study are included in this published article and supplementary tables.
Ethics approval and consent to participate
The Regional Committee for Medical and Health Research Ethics, North Norway, has approved the protocol as a quality assurance study in laboratory work fulfilling the requirements for data protection procedures within the department (REK Nord 2014/787). Norwegian regulations exempt quality assurance studies from written informed consent from the patients (https://lovdata.no/dokument/SF/forskrift/2000-12-15-1265).
Competing interests
The authors declare that they have no competing interests.
Additional file 1: Table S1.
Diagnoses per pathologist (P2–P5) in samples with Normal cytology at UNN. Table S2. Diagnoses per pathologist (P2–P5) in samples with ASC-US cytology at UNN. Table S3. Diagnoses per pathologist (P2–P5) in samples with LSIL cytology at UNN. Table S4. Diagnoses per pathologist (P2–P5) in samples with ASC-H cytology at UNN. Table S5. Diagnoses per pathologist in samples with HSIL cytology at UNN. Figure S1. Distribution of high-grade cytology (ASC-H+) diagnoses by observer (P2–P5) in women with at least one high-grade cytology (N=61). Figure S2. Distribution of high-grade cytology (ASC-H+) diagnoses by observer (P2–P5) in women with histological CIN2+ in follow-up (N=32). (PDF 100 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Cite this article
Sørbye, S.W., Suhrke, P., Revå, B.W. et al. Accuracy of cervical cytology: comparison of diagnoses of 100 Pap smears read by four pathologists at three hospitals in Norway. BMC Clin Pathol 17, 18 (2017). https://doi.org/10.1186/s12907-017-0058-8
- High-grade Cytology
- National Cervical Cancer Screening Program
- Weighted Kappa
- Cytological Diagnosis
- High Detection Rate