Skip to main content

Validation of the Postgraduate Hospital Educational Environment Measure (PHEEM) in a sample of 731 Greek residents



The Greek version of the Postgraduate Hospital Educational Environment Measure (PHEEM) was evaluated to determine its psychometric properties, i.e., validity, internal consistency, sensitivity and responsiveness to be used for measuring the learning environment in Greek hospitals.


The PHEEM was administered to Greek hospital residents. Internal consistency was measured using Cronbach’s alpha. Root Mean Square Error of Approximation (RMSEA) was used to evaluate the fit of Structural Equation Models. Content validity was addressed by the original study. Construct validity was tested using confirmatory (to test the set of underlying dimensions suggested by the original study) and exploratory (to explore the dimensions needed to explain the variability of the given answers) factor analysis using Varimax rotation. Convergent validity was calculated by Pearson’s correlation coefficient regarding the participant’s PHEEM score and participant’s overall satisfaction score of the added item “Overall, I am very satisfied with my specialization in this post”. Sensitivity was checked by comparing good versus poor aspects of the educational environment and by satisfied versus unsatisfied participants.


A total of 731 residents from 83 hospitals and 41 prefectures responded to the PHEEM. The original three-factor model didn’t fit better compared to one factor model that is accounting for 32 % of the variance. Cronbach’s α was 0.933 when assuming one-factor model. Using a three-factor model (autonomy, teaching, social support), Cronbach’s α were 0.815 (expected 0.830), 0.908 (0.839), 0.734 (0.793), respectively. The three-factor model gave an RMSEA value of 0.074 (90 % confidence interval 0.071, 0.076), suggesting a fair fit. Pearson’s correlation coefficient between total PHEEM and global satisfaction was 0.765. Mean question scores ranged from 19.0 (very poor) to 73.7 (very good), and mean participant scores from 5.5 (very unsatisfied) to 96.5 (very satisfied).


The Greek version of PHEEM is a valid, reliable, and sensitive instrument measuring the educational environment among junior doctors in Greek hospitals and it can be used for evidence-based SWOT analysis and policy.


Besides providing health services to the public, one of the most important targets of the health system is to train physicians who will provide these services. There are differences concerning the way trainers and trainees understand the perfect training environment [1, 2]. In addition, a major concern is the fact that training differs significantly not only between health systems of different countries due to cultural variations [35], but also between hospitals in the same national Health system if not even among departments of the same hospital.

It is of great importance to evaluate the quality of provided training in order to take corrective measures towards training improvement and to use failure events to improve work process [6]. The existence of an instrument that evaluates the quality of training programs in every day clinical practice is a step toward training perfection [3]. Continuing efforts targeting training improvements led to development of many instruments, created and validated in different countries [7]. These instruments include procedures for undergraduate medical students, such as DREEM [8], and instruments for various medical specialties, such as, anesthesiology, ATEEM [9], surgery, STEEM/OREEM [10], and ambulatory service, ACLEEM [11]. Further, there exists a generic instrument for the assessment of the educational environment of all junior doctors in hospitals, the Postgraduate Hospital Educational Environment Measure (PHEEM) [26], developed and validated by the Centre of Medical Education of the University of Dundee, UK, and being used worldwide [1224].

PHEEM has been already translated and linguistically validated in Greek [25] but has not been psychometrically validated, considering the structural and cultural differences that may exist in the Greek national health care system [26, 27]. The aim of this study was to validate the translated PHEEM in the learning environment for junior doctors in Greek hospitals.


Ethical approval

The study was approved by the Medical Board of the University of Patras, and carried out in compliance with the Helsinki Declaration.

The instrument

The original English PHEEM instrument consists of 40 items, 36 positive and 4 negative statements on a scale from “strongly disagree” to “strongly agree”, scored 0–4 on a five-point Likert scale (after reversing the negative ones), grouped into three subscales for perceptions of role autonomy, teaching, and social support [9]. Its Greek translation [25] was used for this psychometric validation. The original open-ended “Comments” was replaced by two specific ones, “If you could change one thing in this position what would it be” and “What would you not change”, after Whittle et al. [28] . To assess the comprehensibility and the cultural relevance of the items, the Greek version of PHEEM was tested on a group of 8 physicians. Based on this cognitive debriefing, some questions had to be slightly rephrased.

Data collection

The questionnaire was initially distributed from January 2011 until February 2012 in paper form directly by the researchers to a convenient sample of doctors in specialty training programs in a broad selection of hospital departments and health care centers in West Greece. In November 2012 the questionnaire went online on Google Drive platform, emailing as much as possible residents in Greece, based on the records and the email addresses at the prefectural medical associations. Interns were asked to indicate, regarding their current training situation, their agreement with the statements using six options (strongly disagree, disagree, rather disagree, rather agree, agree, strongly agree). The four negative statements were scored in reverse so that in all items the higher the score the better the environment. Information on gender, age, hospital, residency year and specialty were also included. The participation in the study was voluntary and anonymous.

Data analysis

Four instrument properties, namely reliability, validity, sensitivity (ability to detect differences between participants or groups of participants) and responsiveness (ability to detect changes when a participant improves or deteriorates) [29] should be tested. With this data we tested all except the last that was beyond the scope of this study. Mean scores for each item and domain and for the total instrument were calculated. For international comparison all scores are given in a 0–100 (%) scale, after converting the original 1- to 6-point scale, where 1 = 0, 2 = 20, 3 = 40, 4 = 60, 5 = 80 and 6 = 100 [30].


Cronbach’s alpha was used to measure internal consistency and unidimensionality of each set of items that refer to each of the factors [31]. Cronbach’s α higher than 0.7 shows acceptable (0.7–0.8), good (0.8–0.9) or excellent (>0.9) internal consistency; value >0.7 shows questionable (0.6–0.7), poor (0.5–0.6) or unacceptable (< 0.5) internal consistency [29]. Given the total scale alpha (αscale), expected subscale alphas were calculated with the Spearman–Brown formula, αsubscale = kαscale/(1 + (k−1)αscale), where k is the number of items of the subscale divided by the number of items of the total scale. Root Mean Square Error of Approximation (RMSEA) was used to evaluate the fit of Structural Equation Models; value of RMSEA smaller than 0.05 indicates very good fit, while values larger than 0.1 indicate poor fit, and intermediate values a fair fit.


Content validity was addressed by the original study [9]. During the whole validation process, we were careful to spot any content issues that may arise. The same purpose was served by changing the open ended “Comments” by the two specific items described in “The instrument” paragraph.

Construct validity was tested with Confirmatory (CFA) and Exploratory (EFA) Factor Analysis, and with the underlying variable approach (UVA). CFA was used to test whether the set of underlying dimensions suggested by the original study [26] is adequate to explain all inter-relationships among the 40 observed ordinal items in the sample of Greek medical interns. EFA was performed to explore the dimensions needed to explain the variability of the answers; items with loadings <0.4 were excluded [21]. Statistical Package for Social Sciences (SPSS) Version 20 (SPSS Inc., Chicago, IL, USA) was used for these analyses. Further exploration was performed to test if some of the underlying dimensions are strongly correlated and could be represented by a single construct (i.e., one factor). Based on the idea that the observed ordinal variables are generated by a set of underlying continuous random variables, several methods have been developed to conduct factor analysis for ordinal data, using univariate (frequencies) and bivariate (crosstabs) information [32, 33]. Thus, we conducted factor analysis using the UVA in the statistical package LISREL [34], assessing items to factors as suggested by the original paper [26].

Convergent validity We calculated Pearson’s correlation coefficient between participant’s PHEEM score and participant’s overall satisfaction score of the added item “Overall, I am very satisfied with my specialization in this post”. In addition, the total PHEEM mean score was compared with the total satisfaction mean score, both statistically, accepting a p < 0.05 as significant difference, and educationally, accepting a 5 % difference as educationally minimum important difference (EMID), according to the quality of life field [3537].


We are not aware of any educational differences among the Greek hospitals, in order to check the instrument’s ability to detect these differences. However, access to careers advice, counseling opportunities for failing residents, handbook for juniors etc. are not established in the educational environment in Greece; on the other hand, we expect no race or sex discrimination as well as good relation with colleagues; these expectations were tested comparing mean question scores. In addition, residents differ one from the other, and the same applies for teachers and posts, [1, 36]. Thus, it is reasonable to expect differences between the participants, ranging from very unsatisfied to very satisfied; this expectation was tested comparing the mean participant scores.



We obtained 731 completed questionnaires (190 in paper and 541 online) from 55 % male and 45 % female interns, aged 24–49 (mean 33, standard deviation 3.5), being trained in 33 out of the 40 specialties, with a mean training time of 3.1 (1.5) years in total and 2.5 (1.2) in the current post, in 83 out of the 128 hospitals from 41 (80 %) of the 51 Greek prefectures.


Cronbach’s α was 0.933 when assuming one-factor model (internal consistency of the total questionnaire). When we used a three-factor model, using autonomy, teaching and social support as factors, Cronbach’s α were 0.815 (expected 0.830), 0.908 (0.839), 0.734 (793), respectively. Observed autonomy and social support alphas were slightly less than expected but within the same interpretative zone, while the observed teaching alpha was higher than expected and within one upper interpretative zone.


Content validity

Soon after the electronic version was introduced, it was realized that there was a misconception with item seven (7) “There is racism in this post”. Several Greek residents interpreted this question as suggesting discrimination between the different specialties and not race discrimination; racism has gradually become a generic term in Greece meaning any discrimination. Therefore, the item 7 was further clarified to “There is racism (race discrimination) in this post”, a new item (41) “There is a discrimination in this post against some specialties” was added in the end of the questionnaire, and reliability and factor analyses were performed excluding the previous answers of the item seven. Furthermore, the two open-ended questions revealed important aspects that were not included in the original English version, such as easy and fast access to the internet, the way the specialty exams are carried out, and training in primary health care settings, in emergency settings, in outpatient and inpatient care.

Construct validity

We found very large correlations between the three factors: autonomy, teaching, social support (above 0.96 for all three pairwise combinations, data not shown) that suggest all three factors measure the same construct. The results using three-factor model and one-factor model are shown in Table 1. Loadings from the one-factor model are almost identical to the corresponding item loadings from the three-factor solution. The three-factor model gave an RMSEA value of 0.074 (90 % confidence interval 0.071, 0.076), suggesting a fair fit.

Table 1 Factor analysis (item loadings assuming three- and one-factor solutions, sorted by the one-factor total solution) and reliability (Cronbach’s alpha, last line) of our data

Employing the one-factor analysis model in SPSS gave identical results, suggesting that the ordinal items have metric properties. More specifically the one-factor explained 32 % of the total variance, whereas at least 7 factors were needed to explain 50 % of the total variance. Using three factors explained 42 % of the total variance. However, it is very difficult to associate items to factors even after trying various rotation methods. The inflexion point in the scree plot is very subjective (Fig. 1). Using as criterion to keep all those factors with eigenvalue higher than 1.5 [21] yields three factors. Keeping all factors with an eigenvalue higher than 1.0, which is one of the default SPSS options, gives 8 factors. Using as criterion to keep only factors that increase the percentage of variance explained by at least 5 % [21], results in two factors. Excluding items 1, 7–9, 11, 13, 20, 25 and 26 that have loadings <0.4, the percentage of variance explained increased to 38 %.

Fig. 1

Scree plot of the factorial analysis and eigenvalues associated with principal components

Convergent validity

Pearson’s correlation coefficient between participant total PHEEM score and participant overall satisfaction score was 0.765 (Fig. 2). The total PHEEM mean score (41.1 %) was statistically (p = 0.002) but not educationally (1.8 < 5 % = EMID) higher than the overall satisfaction mean score (39.3 %; Fig. 2, bottom).

Fig. 2

Correlation between participant global satisfaction and participant total PHEEM scores


Mean question scores varied from 19.0 to 73.7 % (Fig. 3), while mean subscale scores fluctuated much less (autonomy 38.6 %, teaching 41.7 %, social support 43.6 %). Mean participant scores (Fig. 4) varied from 5.5 % (very unsatisfied) to 96.5 % (very satisfied).

Fig. 3

Mean score of every single item, of the three subscales, of the total PHEEM and of global satisfaction. The questions are marked with the first letter of the subscale they belong to (a autonomy, t teaching, s social support) and their identification number, e.g., s19, a9, t22 etc

Fig. 4

Participant mean score, presenting the first participant with the highest score (participant with ID 345) and the last with the lowest score (ID 600) and every twentieth intermediate participant. Between two consequent bars other 19 participant are lying, except of the first two bars (18 in between) and the last two (10 in between)


The aim of this study was to validate the Greek version of PHEEM questionnaire in the working environment of the Greek National Health System hospitals, including participants from a wide range of hospitals and medical specialties. Validation of instruments is the process of determining whether there are grounds for believing that the instrument measures what it is intended to measure, and that it is useful for its intended purpose, by testing instrument’s reliability, validity, sensitivity and responsiveness [29].


Cronbach’s alpha of the total tool was higher than 0.90, indicating excellent internal consistency [27, 29]. This is very similar to the value of 0.91 [26]; 0.921 [2], 0.93 [12] and 0.92 [24]. However, Tavakol and Dennick [38] argue that a value of α > 0.90 may suggest redundancies and show that the test length should be shortened. There are at least three reasons for such high alphas: the correlation of the items on the scale (actual reliability), the length of the scale (the number of questions), and the length of the Likert scale (the number of response options). Response options, five in the original PHEEM, are six in this study and this adds to reliability [27]. The length of the scale might be an issue (see validity below). In any case, if we accept that other than actual reliability factors cause a 5 % or even 15 % increase in Cronbach’s alpha, it would still remain >0.80, indicating a very good reliability. Thus, we can conclude that our effort produced a reliable questionnaire.


Construct validity

We used two different factor analysis models. Firstly we used factor analysis assuming the three subscales that were originally identified as autonomy, teaching, and social support [21]. Secondly, we used factor analysis assuming only one subsequent factor. Our results show that the loadings of each item didn’t vary significantly across the two models. We also found that treating ordinal items as continuous did not have an effect on the magnitude of the loadings. The three-factor model didn’t fit better compared to the one-factor model, meaning that categorizing questions into three independent factors is unnecessary for assessing the Greek specialty training environment. The one-factor model is in concordance with Boor et al. [14], but differs from Clapham et al. [2] who suggest 10 factors.

Content validity

Based on the factor analysis, the following five questions with loadings <0.3 should be removed (Table 1): Bleeped inappropriately; junior handbook; catering facilities; racism; sex discrimination (six more items should be removed if the loading cut point was 0.4). However, the expert panel consensus (consisted by 3 consultants, PK, EJ, ID, and 2 residents, VK, SB) decided that these questions are important aspects of the specialty training, thus they were not removed from the tool. In addition, especially for the Greek culture, we split the racism into two items “There is racism (race discrimination) in this post” and “There is a discrimination in this post against some specialties” (the last question’s loading was 0.68, while the loadings of the other questions remained unchanged). Furthermore, the two open-ended questions revealed important aspects that were not included in the original English version, such as “I have easy and fast access to the internet at my workplace”, “I am satisfied with the way the specialty exams are carried out”, “My training in primary healthcare settings is sufficient”, “My training in emergency inside and outside the hospital is sufficient”, “My training in outpatient care is sufficient”, and “My training in inpatient care within the wards is sufficient”. These were incorporated in the final Greek version (Appendix). However, we think the time has come for a meeting of the original version constructors with all worldwide translators, validators and users, to discuss and conclude for a new PHEEM version. The things are subject to constant change, the world changes, and perception too. In order the PHEEM to remain alive, it should also change. Exactly as cars, computers, word processors and other do. The meanwhile accumulated experience should be incorporated into a new version, the PHEEM v.1.

Convergent validity

The high correlation (in the upper quartile of Pearson’s correlation coefficient) between participant total PHEEM score and participant overall satisfaction score is consistent with the tool’s convergent validity. The same conclusion could be driven from the no educationally important difference of the two total means, while their statistical difference might be due to the very large sample size. Thus, there is no evidence that the Greek PHEEM is not a convergent tool. The illustrated convergent validity (a question on overall satisfaction) may rather subsume the items of the entire questionnaire, but the PHEEM has the ability to reveal where exactly the problem lies and how big it is. Thus, overall satisfaction cannot substitute the PHEEM instrument, but then the PHEEM can.

Sensitivity and responsiveness

The low scores in items “access to careers advice”, “counseling opportunities for residents who fail”, “handbook for juniors”, and the high scores in “race or sex discrimination” and “good relation with colleagues” are as expected in Greece; i.e., the tool was sensitive enough to catch an existing situation. The same happened with the mean participant scores, varying from very unsatisfied till very satisfied; people differ among each other and the post environments are expected to differ as well; the tool was very sensitive to uncover this difference. Thus, we can conclude that the Greek PHEEM is a sensitive tool. Checking responsiveness was beyond the scope of this study. However, since sensitivity is a prerequisite for responsiveness [29], we can reasonably expect that the Greek PHEEM is also a responsive tool.


Using paper and electronic questionnaires might be a limitation; however, we calculated separate scores and we found no difference (39.6 vs. 41.7 %). Direct conclusions on responsiveness remain for a future work. Also, this instrument collects one stakeholder’s views, those of the trainees; perceptions of others—trainers, nurses, administrators, insurance, and of course patients—are missing; we need at least those of the two main players [39, 40]. Finally, we would like to emphasize—and this is a warning rather than a limitation—that this study focused on the validation of an instrument, not on measuring educational environment in Greek hospitals; thus, though they are based on a large sample including almost all medical specialties in university, tertiary and regional hospitals, results presented here, being a probably good estimation of this environment, should be interpreted with caution (the sample is not statistically representative).


Through the validation process described above, there are grounds for believing that the Greek version of PHEEM measures what it is intended to measure, i.e., the education environment of the Greek hospitals, and that it is useful for its intended purpose. Even if the illustrated convergent validity (assessed by the single item of “global satisfaction”) may rather summarize all items of the PHEEM measure, the PHEEM has the ability to reveal where exactly the problem lies (e.g. “no suitable access to career advice” or “lack of regular feedback from seniors” etc.) and how big it is. Thus, we recommend to use the Greek version of PHEEM to monitor the educational environment and quality of hospital training in Greece and to assess and follow up on the effectiveness of potential corrective measures. A meeting of constructors, translators, validators and users could agree in a new version (v.1) of PHEEM.


  1. 1.

    Boor K, Scheele F, van der Vleuten CP, Teunissen PW, den Breejen EM, Scherpbier AJ. How undergraduate clinical learning climates differ: a multi-method case study. Med Educ. 2008;42(10):1029–36.

    Article  PubMed  Google Scholar 

  2. 2.

    Clapham M, Wall D, Batchelor A. Educational environment in intensive care medicine–use of Postgraduate Hospital Educational Environment Measure (PHEEM). Med Teach. 2007;29(6):e184–91.

    Article  PubMed  Google Scholar 

  3. 3.

    Hoff TJ, Pohl H, Bartfield J. Creating a learning environment to produce competent residents: the roles of culture and context. Acad Med. 2004;79:532–9.

    Article  PubMed  Google Scholar 

  4. 4.

    Jain G, Mazhar MN, Uga A, Punwani M, Broquet KE. Systems-based aspects in the training of IMG or previously trained residents: comparison of psychiatry residency training in the United States, Canada, the United Kingdom, India, and Nigeria. Acad Psychiatry. 2012;36(4):307–15.

    Article  PubMed  Google Scholar 

  5. 5.

    Wadensten B, Wenneberg S, Silen M, Tang PF, Ahlstrom G. A cross-cultural comparison of nurses’ ethical concerns. Nurs Ethics. 2008;15(6):745–60.

    Article  PubMed  Google Scholar 

  6. 6.

    Argyris C. On organizational learning. 2nd ed. Oxford: Blackwell Business; 1999.

    Google Scholar 

  7. 7.

    Nagraj S, Wall D, Jones E. Can STEEM be used to measure the educational environment within the operating theatre for undergraduate medical students? Med Teach. 2006;28(7):642–7.

    Article  PubMed  Google Scholar 

  8. 8.

    Roff S. The Dundee Ready Educational Environment Measure (DREEM): a generic instrument for measuring students’ perceptions of undergraduate health professions curricula. Med Teach. 2005;27(4):322–5.

    Article  PubMed  Google Scholar 

  9. 9.

    Holt MC, Roff S. Development and validation of the Anaesthetic Theatre Educational Environment Measure (ATEEM). Med Teach. 2004;26(6):553–8.

    CAS  Article  PubMed  Google Scholar 

  10. 10.

    Cassar K. Development of an instrument to measure the surgical operating theatre learning environment as perceived by basic surgical trainees. Med Teach. 2004;26(3):260–4.

    Article  PubMed  Google Scholar 

  11. 11.

    Riquelme A, Padilla O, Herrera C, Olivos T, Roman JA, Sarfatis A, Solis N, Pizarro M, Torres P, Roff S. Development of ACLEEM questionnaire, an instrument measuring residents’ educational environment in postgraduate ambulatory setting. Med Teach. 2013;35:e861–6.

    Article  PubMed  Google Scholar 

  12. 12.

    Aspegren K, Bastholt L, Bested KM, Bonnesen T, Ejlersen E, Fog I, et al. Validation of the PHEEM instrument in a Danish hospital setting. Med Teach. 2007;29(5):498–500.

    CAS  Article  PubMed  Google Scholar 

  13. 13.

    Auret KA, Skinner L, Sinclair C, Evans SF. Formal assessment of the educational environment experienced by interns placed in rural hospitals in Western Australia. Rural Remote Health. 2013;13(4):2549 (Epub 2013 Oct 20).

    PubMed  Google Scholar 

  14. 14.

    Boor K, Scheele F, van der Vleuten CP, Scherpbier AJ, Teunissen PW, Sijtsma K. Psychometric properties of an instrument to measure the clinical learning environment. Med Educ. 2007;41(1):92–9.

    CAS  Article  PubMed  Google Scholar 

  15. 15.

    Gooneratne IK, Munasinghe SR, Siriwardena C, Olupeliyawa AM, Karunathilake I. Assessment of psychometric properties of a modified PHEEM Questionnaire. Ann Acad Med Singapore. 2008;37(12):993–7.

    CAS  PubMed  Google Scholar 

  16. 16.

    Gough J, Bullen M, Donath S. PHEEM ‘downunder’. Med Teach. 2010;32(2):161–3.

    Article  PubMed  Google Scholar 

  17. 17.

    Khan JS. Evaluation of the educational environment of postgraduate surgical teaching. J Ayub Med Coll Abbottabad. 2008;20(3):104–7.

    Google Scholar 

  18. 18.

    Nishigori H, Nishigori M, Yoshimura H. DREEM, PHEEM, ATEEM and STEEM in Japanese. Med Teach. 2009;31(6):560.

    Article  PubMed  Google Scholar 

  19. 19.

    Pinnock R, Reed P, Wright M. The learning environment of paediatric trainees in New Zealand. J Paediatr Child Health. 2009;45(9):529–34.

    Article  PubMed  Google Scholar 

  20. 20.

    Riquelme A, Herrera C, Aranis C, Oporto J, Padilla O. Psychometric analyses and internal consistency of the PHEEM questionnaire to measure the clinical learning environment in the clerkship of a Medical School in Chile. Med Teach. 2009;31(6):e221–5.

    Article  PubMed  Google Scholar 

  21. 21.

    Schonrock-Adema J, Heijne-Penninga M, Van HE, Cohen-Schotanus J. Necessary steps in factor analysis: enhancing validation studies of educational instruments. The PHEEM applied to clerks as an example. Med Teach. 2009;31(6):e226–32.

    Article  PubMed  Google Scholar 

  22. 22.

    Taguchi N, Ogawa T, Sasahara H. Japanese dental trainees’ perceptions of educational environment in postgraduate training. Med Teach. 2008;30(7):e189–93.

    Article  PubMed  Google Scholar 

  23. 23.

    Vieira JE. The postgraduate hospital educational environment measure (PHEEM) questionnaire identifies quality of instruction as a key factor predicting academic achievement. Clinics (Sao Paulo). 2008;63(6):741–6.

    Article  Google Scholar 

  24. 24.

    Wall D, Clapham M, Riquelme A, Vieira J, Cartmill R, Aspegren K, Roff S. Is PHEEM a multi-dimensional instrument? An international perspective. Med Teach. 2009;31(11):e521–7.

    Article  PubMed  Google Scholar 

  25. 25.

    Rammos A, Tatsi K, Bellos S, Dimoliatis IDK. Translation into Greek of the postgraduate hospital educational environment measure (PHEEM). Arch Hellen Med. 2011;28:48–56.

    Google Scholar 

  26. 26.

    Roff S, McAleer S, Skinner A. Development and validation of an instrument to measure the postgraduate clinical learning and teaching educational environment for hospital-based junior doctors in the UK. Med Teach. 2005;27(4):326–31.

    CAS  Article  PubMed  Google Scholar 

  27. 27.

    Streiner DL, Norman GR. Health measurement scales: A practical guide to their development and use. 4th ed. Oxford: Oxford University Press; 2008.

    Book  Google Scholar 

  28. 28.

    Whittle SR, Whelan B, Murdoch-Eaton DG. DREEM and beyond: studies of the educational environment as a means for its enhancement. Educ Health. 2007;20:7.

  29. 29.

    Fayers PM, Machin D. Quality of life: assessment, analysis and interpretation. Chichester: Willey; 2000.

    Book  Google Scholar 

  30. 30.

    Dimoliatis IDK, Jelastopulu E. Surgical theatre (operating room) measure STEEM (OREEM) scoring overestimates educational environment: the 1-to-L Bias. Univ J Educ Res. 2013;1(3):247–54.

    Google Scholar 

  31. 31.

    Cronbach LJ, Warrington WG. Time-limit tests: estimating their reliability and degree of speeding. Psychometrika. 1951;16(2):167–88.

    CAS  Article  PubMed  Google Scholar 

  32. 32.

    Jöreskog KG. New developments in LISREL: analysis of ordinal variables using polychoric correlations and weighted least squares. Qual Quant. 1990;24:387–404.

    Article  Google Scholar 

  33. 33.

    Muthen B. A general structural equation model with dichotomous, ordered, categorical, and continuous latent variables indicators. Psychometrika. 1984;49:115–32.

    Article  Google Scholar 

  34. 34.

    Jöreskog KG, Sörbom D. LISREL 8. Chicago, IL: Scientific Software International Inc; 1996.

    Google Scholar 

  35. 35.

    Flokstra-de Blok BM, et al. Development and validation of a self-administered Food Allergy Quality of Life Questionnaire for children. Clin Exp Allergy. 2009;39(1):127–37.

    CAS  Article  PubMed  Google Scholar 

  36. 36.

    Jaeschke R, Singer J, Guyatt GH. Measurement of health status. A scertaining the minimal clinically important difference. Control Clin Trials. 1989;10(4):407–15.

    CAS  Article  PubMed  Google Scholar 

  37. 37.

    Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in a disease-specific Quality of Life Questionnaire. J Clin Epidemiol. 1994;47(1):81–7.

    CAS  Article  PubMed  Google Scholar 

  38. 38.

    Tavakol M, Dennick R. Making sense of Cronbach’s alpha. Int J Med Educ. 2011;2:53–55.

  39. 39.

    Harden RM, Crosby J. AMEE Guide No 20: the good teacher is more than a lecturer—the twelve roles of the teacher. 1st ed. Dundee: Association for Medical Education in Europe (AMEE); 2000.

    Google Scholar 

  40. 40.

    Karakitsiou DE, Markou A, Kyriakou P, Pieri M, Abuaita M, Bourousis E, et al. The good student is more than a listener: the 12 + 1 roles of the medical student. Med Teach. 2012;34(1):e1–8.

    CAS  Article  PubMed  Google Scholar 

Download references

Authors’ contributions

ID and EJ conceived of the study, supervised the entire work and co-wrote the manuscript. PK collected all data in paper form and drafted the manuscript. VK and SB devised the online questionnaire. DM and ID performed the statistical analysis. All authors contributed to the interpretation process. All authors read and approved the final manuscript.


The authors wish to acknowledge the contribution of the residents; without their effort and support this study would not have been possible.

Competing interests

The authors declare that they have no competing interests.

Author information



Corresponding author

Correspondence to Eleni Jelastopulu.



Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Koutsogiannou, P., Dimoliatis, I.D.K., Mavridis, D. et al. Validation of the Postgraduate Hospital Educational Environment Measure (PHEEM) in a sample of 731 Greek residents. BMC Res Notes 8, 734 (2015).

Download citation


  • Validation
  • Greece
  • Hospital
  • Education environment
  • Measure
  • Residents