One of the main aims of my PhD research project is to grasp how Hungarian medium-sized and large companies define their societal role and their social responsibilities. I did not want to use survey or interview data, but rather I attempted to map how these themes appear in corporate discourse. Therefore, I analysed corporate homepages with two methods, namely, content and discourse analyses. These two methods complemented each other during the analysis and provided both quantitative and qualitative data. After a brief methodological overview, this case study presents the main differences of the two methodological approaches through three dimensions. The first dimension concerns the coding process, while the second and the third are related to the differences in the actual analyses of texts and the role the context plays in these analyses.
By the end of the case study, you should
- Understand the role of textual analysis in social sciences
- Have a better understanding of the methodological challenges of using the Internet as a source of textual analysis
- Have a better understanding of the methodological differences of content analyses and discourse analyses
- Be able to examine the pros and cons of content analyses and discourse analyses as a means to examine social phenomena
As a PhD student in sociology, I attempted to combine my job experience as a corporate social responsibility (CSR) consultant, my interest in social studies and my deep attachment to discourse analysis. Therefore, I have formulated the main research question of my PhD thesis as follows: What and how do Hungarian medium-sized and large companies communicate on their websites in terms of their role and responsibility in society?
One must be aware that corporate communication cannot be equated with the actual operation of these companies; by focusing on this communication, however, the researcher is able to draw conclusions about the companies' understanding of social responsibility, their priorities and about the role CSR plays in constructing corporate identity. Furthermore, when focusing on corporate communication, it is worth examining how companies attempt to legitimise their social role and to which social groups they refer as the reason and purpose of their existence (e.g. workers to whom they give jobs and customers whose needs they aim to satisfy).
Moreover, because companies make important value statements when they incorporate various social norms, values and regulations into their functioning, they are capable of influencing social discourses. This can be witnessed in cases in which corporate actors emphasise the importance of environmental quality or work–life balance, thereby affecting the thematisation of these issues. Consequently, analysing what they say about their social role, and how they say it, can help us to understand the nature of (Hungarian) companies, and thereby we might be able to mediate better between the expectations of economic actors and the demands of society.
The research project which served as a basis for this case study was financed by the OTKA Research Fund, Project No.: K104707.
The Main Methods
I utilised two methods to investigate company websites: content analysis and discourse analysis. These methods complement each other, on the one hand, and provide both quantitative and qualitative descriptions about the subject of the research, on the other.
Content analysis is an appropriate method for arranging texts into a unified and generalised coding structure which, in turn, can be easily transformed into a database. This database can provide statistically calculated and tested data; in other words, the researcher can analyse texts as quantitative data utilising statistical methods and tools (such as the statistical program SPSS). By applying quantitative content analysis, one can delineate trends, present frequencies and test differences between groups.
Discourse analysis, in contrast, is an interpretative method focusing on textual and structural features of texts and on how actors mobilise different topics and discursive structures. By analysing these discursive elements, one can reveal how various actors give different interpretations to a particular social phenomenon (like companies' societal role) and how this phenomenon is used as a specific discursive strategy (e.g. to legitimise the way a given company operates).
Both methods allow the direct analysis and interpretation not only of texts but also of the social phenomena covered by them. This characteristic makes them suitable for investigating the publicly communicated image of companies, especially the social role and responsibility they present.
Although my research is based on web-content and discourse analyses, I did not create a random sample from corporate homepages but utilised a stratified selection approach from the whole population of Hungarian medium-sized and large companies. The population was the officially registered companies in operation with at least 150 employees in 2011. So, I have generated a stratified sample (10% of the population) utilising four variables: ownership (domestic–private, domestic–state and foreign), number of employees, revenues and sector of operation. After that, I narrowed my sample to only companies having a functioning webpage. This means that the final sample consists of 146 homepages.
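The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is invented, the field names (`ownership`, `size_band`, `has_website`) are hypothetical, only two of the four stratifying variables are used for brevity, and the rule of keeping at least one company per stratum is an assumption that the study does not report.

```python
import random
from collections import defaultdict

def stratified_sample(companies, strata_keys, fraction, seed=0):
    """Draw roughly `fraction` of the companies from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for company in companies:
        # A stratum is one combination of values of the stratifying variables.
        strata[tuple(company[key] for key in strata_keys)].append(company)
    sample = []
    for members in strata.values():
        # Assumption: keep at least one company so small strata are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population; all values are illustrative, not the study's data.
population = [
    {
        "id": i,
        "ownership": ["domestic-private", "domestic-state", "foreign"][i % 3],
        "size_band": "large" if i % 4 == 0 else "medium",
        "has_website": i % 5 != 0,
    }
    for i in range(600)
]

sample = stratified_sample(population, ["ownership", "size_band"], fraction=0.10)
# Second step: narrow the sample to companies with a functioning webpage.
final_sample = [c for c in sample if c["has_website"]]
print(len(sample), len(final_sample))
```

Note that the website filter is applied after sampling, as in the study, so the final sample is smaller than 10% of the population.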
One must be aware that using the texts of corporate websites (as is the case with any other source of data) raises ethical issues. In this analysis, I have used only publicly and openly available texts (companies' websites); therefore, I did not ask for permission from the companies themselves. Nevertheless, analysing these texts also means that the researcher lifts them out of the original context in which they are embedded. Subsequently, these texts are positioned in an altered context which allows for a comparison between them and, thereby, the interpretation of their content. This new interpretative framework can slightly change the emphases of these texts. To deal with this ethical dilemma, I never mentioned the names of the companies when I cited their texts; I only indicated their sector as a point of reference.
Methods Overview: Content Analysis and Discourse Analysis
It is beyond the scope of this case study to present all the methodological steps taken in this research project. Nonetheless, in the following sections, my aim is to highlight the essential methodological characteristics of my approach in relation to the numerous schools of content and discourse analyses.
Reading any of the books by the main proponents of content analysis highlights the fact that the method is applicable to various fields in the social sciences with both quantitative and qualitative foci. Examples include Klaus Krippendorff's (2004) book Content Analysis: An Introduction to Its Methodology, which guides the reader through the methodological steps and decisions, and Kimberly Neuendorf's (2002) book The Content Analysis Guidebook, which operates with an integrative model of content analysis and discusses several different approaches and fields of application.
I applied quantitative web-content analysis to investigate themes, programmes and initiatives appearing on corporate homepages. To define content analysis, I drew on Krippendorff's definition, which emphasises that content analysis is a research technique that (a) works with texts, (b) by analysing these, draws conclusions not only about the texts themselves but also about their context and (c) does so in a reliable and valid way. The different parts of this definition can be explained further:
- working with texts means the units of analysis are textual elements of differing lengths such as words, sentences or paragraphs (or in the most recent approaches units can be even audio–visual objects, for example, videos, pictures or icons).
- however, the research is not only aimed at the interpretation of the specific text's inner grammatical or semantic structure but also attempts to draw conclusions about the text's context (e.g. about its author).
- this movement between text and context, however, requires very thorough and consistent methodological design to guarantee the reliability and validity of the data as well as their interpretation.
When using the Internet as a text resource, one faces two further difficulties: the temporal characteristics of texts and the complexity of the non-linear structure created by hyperlinks embedded in texts.
Identifying the temporal characteristics of texts on the Internet is often a challenge for the researcher; sometimes it is even impossible. First, there are several texts without any reference to date and time. Second, there is a lot of accessible previous information, which means that one has to draw a line to select those texts which are the most relevant in terms of his or her research. These temporal characteristics of web-texts also mean that a redesign of a corporate homepage could make all structure and content totally disappear, which compels the researcher to regularly archive analysed contents.
Another difficulty stemming from the characteristics of the content on the web is its non-linear structure. Hyperlinks, subpages and pop-ups create a new text-structure, which is a peculiarity of the Internet. This interconnectedness of texts significantly complicates the analysis because borders between texts are blurred, or sometimes, even the borders of a homepage must be arbitrarily drawn by the analyst.
Therefore, the researcher must take extra care to make it very transparent which texts of a given homepage are under investigation and which are excluded from the analysis.
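One possible way to keep such an archive and inclusion record is sketched below. It is a minimal illustration, not the procedure used in this project: the function names, the record fields and the example URL are all hypothetical, and the page text is assumed to have been retrieved already.

```python
import hashlib
from datetime import datetime, timezone

def make_snapshot(url, text, included, reason=""):
    """Record what was analysed, when it was retrieved and whether it is in the corpus."""
    return {
        "url": url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        # The checksum makes later changes to the page detectable.
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "included": included,
        "reason": reason,
        "text": text,
    }

def has_changed(snapshot, current_text):
    """Return True if the page text differs from the archived version."""
    current = hashlib.sha256(current_text.encode("utf-8")).hexdigest()
    return current != snapshot["sha256"]

# Hypothetical pages; example.com stands in for a real corporate homepage.
archive = [
    make_snapshot("https://example.com/about-us", "We are a family-owned mill.", True),
    make_snapshot("https://example.com/report.pdf", "Annual report", False,
                  reason="downloadable content excluded"),
]

corpus = [s for s in archive if s["included"]]
print(len(corpus), has_changed(archive[0], "We are a family-owned mill."))
```

Storing the exclusion reason next to each record is one way of making the drawn borders of a homepage transparent to later readers.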
There are many different schools in the field of discourse analysis. These different approaches to the method are discussed in a systematic manner in books like Discourse Theory and Practice by Margaret Wetherell, Stephanie Taylor and Simeon J Yates (2001), whose work is structured by six key areas, or Methods of Text and Discourse Analysis by Stephan Titscher, Michael Meyer, Ruth Wodak and Eva Vetter (2003), which discusses 12 different methodological approaches in detail. In my research project, I drew heavily on the approach of critical discourse analysis (CDA) in general and utilised many of the features described in Ruth Wodak's discourse-historical approach in particular.
These theoretical frameworks emphasise the importance of the texts' wider context. They stress the point that the researcher has to go beyond linguistic considerations and include the additional textual information and even the social context in the analysis of texts. Furthermore, they underline that the relationship between texts and their social context is not one-sided but a complex and interconnected system. On one hand, social norms and rules frame the structure and content of texts. On the other hand, texts could also affect the same social values, norms and processes. This interconnectedness of texts with their contexts and their active strategic usage are the determining features of discourses. So, we could define discourse as a given set of texts whose relations to each other and usage are controlled by social norms and rules, yet these texts could also affect these very norms and rules themselves. While texts are static in themselves, discourses are active processes through which actors (consciously or unconsciously) employ strategies to give meaning to and to affect social phenomena, values, norms and rules. In this sense, analyses of discourses are not the examination of texts in themselves, but how these texts are utilised and what strategies are related to them.
My discourse analytical method is partly based on three main characteristics of Ruth Wodak's discourse-historical approach: intertextuality, triangulation and ethnographic methods. Intertextuality means that researchers have to examine not only the texts under scrutiny, but also the interconnection of different styles, genres, themes both in the texts and between them. Thus, the interpretation of texts always implies the analysis of their textual and social contexts. Triangulation here refers to the combination of various methodological and data-collecting approaches used to analyse a given discursive phenomenon. This contributes to the unravelling of the complex relationship between discourse(s) and social structure. Using ethnographic methods helps the researcher to pay attention to the cultural and social milieu of texts in the course of the analysis.
This brief overview discussed some of the major characteristics of content and discourse analyses. However, to understand their strengths and weaknesses, we need to see them in action, in other words, how and for what purpose we can use them to investigate social phenomena. Therefore, in the following section, I address these issues by focusing on three domains of my research project.
Methods in Action at Three Domains
The preceding methodological overview might depict the main issues and difficulties of the two methods individually, but to elucidate the real differences between them, one needs to see how they are applied in action. Therefore, in the following part, I would like to guide the reader through three domains where the most striking differences lie. These are as follows: research questions and coding schemes, content versus discourse of texts and the role and range of contexts.
Research Questions and Coding Schemes
One dimension along which content and discourse analyses differ is the way research questions are formulated and, consequently, the approach to and the structure of coding.
In the case of quantitative content analysis, the researcher works with well- and pre-defined research questions, which might be formulated even in the form of hypotheses. In line with this, the aim of the analysis is to reveal statistical descriptions, like frequencies, and relationships like significant differences between topics and so on. Accordingly, before the analysis, the researcher has to identify all relevant categories, topics and themes. This pre-defined focus determines which content will be searched and registered in the texts; thus, it also serves as a basis for the elaboration of the coding scheme. Because I attempted to identify particular themes and topics in corporate homepages, I utilised binary – in other words, yes/no – coding to represent whether information on a particular topic could be accessed on a given page or not.
The steps were as follows:
In the beginning, I delineated the two main research topics:
- identifying CSR content on corporate homepages
- analysing significant differences of content between groups by given company features (e.g. number of employees, revenue and sector)
Subsequently, I elaborated the coding scheme by drawing on several prior research projects and the relevant literature. In this fashion, I defined five main fields of investigation (CSR themes, corporate programmes and policies, work–life balance initiatives, stakeholder-groups mentioned and some smaller issues).
Consequently, the coding scheme contained 58 binary (yes/no) questions regarding whether a particular theme/programme/initiative appears on the homepage (code 1) or not (code 0).
By recording these codes in relation to every corporate homepage in the sample, I could set up a detailed and extensive database. In this fashion, I was able to utilise a statistical program (SPSS) to analyse texts as quantitative data in a reliable way.
To sum up the aforementioned arguments, quantitative content analysis means precisely defined research questions, a coding scheme defined before data collection and the statistical analysis of the quantitative data gathered.
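The database produced by binary coding can be pictured as one row per homepage and one 0/1 column per question. The sketch below uses plain Python rather than SPSS; the three code names, the four companies and their values are invented for illustration, whereas the real scheme contained 58 questions.

```python
# Hypothetical code names standing in for the 58 binary questions.
CODES = ["quality_system", "environment_system", "part_time_work"]

# One record per homepage: 1 = the theme appears, 0 = it does not.
records = [
    {"company": "A", "ownership": "foreign", "quality_system": 1, "environment_system": 1, "part_time_work": 0},
    {"company": "B", "ownership": "foreign", "quality_system": 1, "environment_system": 0, "part_time_work": 0},
    {"company": "C", "ownership": "domestic", "quality_system": 0, "environment_system": 0, "part_time_work": 1},
    {"company": "D", "ownership": "domestic", "quality_system": 1, "environment_system": 0, "part_time_work": 0},
]

def occurrence_rate(rows, code):
    """Percentage of homepages on which the coded theme appears."""
    return 100 * sum(row[code] for row in rows) / len(rows)

def occurrence_by_group(rows, code, group_key):
    """Percentage of homepages with the theme, split by a company feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[code])
    return {group: 100 * sum(values) / len(values) for group, values in groups.items()}

for code in CODES:
    print(code, occurrence_rate(records, code))
print(occurrence_by_group(records, "quality_system", "ownership"))
```

Because every cell is a 0 or a 1, frequencies and group comparisons reduce to simple sums and proportions, which is what makes the coded texts usable as quantitative data.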
Discourse analysis usually works with general research questions aiming to grasp implicit meanings and the very nature and characteristics of the social phenomenon analysed. For example, I attempted to explore how Hungarian companies define their societal role, what kinds of goals and functions they define for themselves and whose demands they are trying to meet. Accordingly, I did not have well-defined and clearly specified research questions or a pre-fixed coding scheme preceding text analysis. The ‘codes’, in a certain sense, emerged in the course of the analysis, and with each text analysed, the list of themes was getting longer and longer.
In the case of my research project, this ‘openness’ meant that when I analysed my first text, I started with a blank page; in other words, I only had an empty ‘company-role’ list on which I aimed to record my findings. After the first text, I had a relatively narrow ‘code-list’ which I used to analyse the second text. If I found a company role or goal which was not on the list yet, then I added that item to the others. So at the end of the 30th text analysed, my code-list contained almost 40 items.
In the case of discourse analyses, coding actually represents the very process through which the coding system emerges and is being shaped from text to text. Thus, the code-list is not a strict and closed system of items, but a continuously widening and changing catalogue of interconnected items. However, the researcher needs some constraints, even if these are self-imposed, otherwise the process and the number of codes will become part of a never-ending story.
There are different possibilities to narrow down the scope of the analysis:
- predetermine a number of texts (here homepages), which will be used to prepare the code-list. For example, the code-list can be fixed after the first 50 texts, and this list will serve as a coding scheme for the rest of the analysis.
- reduce the number of codes by merging those that are similar to each other or close in meaning.
- exclude the least frequent company roles (e.g. I ruled out those company roles and goals which appeared in the texts fewer than 10 times).
Nonetheless, whatever limit we apply, the code-list cannot be seen as final until the end of the analysis. It is always possible that a new company role appears, which is too important (or frequent) to be ignored. This continuous movement between texts and codes (and the utilised theoretical background) is called ‘iteration’.
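The bookkeeping side of this iterative process can be illustrated with a short sketch. It assumes that the analyst has already read each text and assigned role labels by hand; the labels, the merge mapping and the threshold below are hypothetical, and the code only tracks how the list grows and how the merging and frequency constraints narrow it. It does not replace the interpretative work.

```python
from collections import Counter

def build_code_list(coded_texts, merge_map=None, min_count=1):
    """Accumulate analyst-assigned labels text by text.

    coded_texts: one list of labels per analysed text.
    merge_map: maps a label onto the similar label it is merged into.
    min_count: labels occurring fewer times than this are excluded.
    """
    merge_map = merge_map or {}
    counts = Counter()
    growth = []  # size of the code-list after each text
    for labels in coded_texts:
        for label in labels:
            counts[merge_map.get(label, label)] += 1
        growth.append(len(counts))
    kept = {code: n for code, n in counts.items() if n >= min_count}
    return kept, growth

# Hypothetical labels from three analysed texts.
coded_texts = [
    ["keep customers", "good reputation"],
    ["retain customers", "enter new markets"],
    ["good reputation", "provide employment"],
]
merge_map = {"retain customers": "keep customers"}

kept, growth = build_code_list(coded_texts, merge_map, min_count=2)
print(growth)  # the code-list keeps widening from text to text
print(kept)    # codes that survive merging and the frequency threshold
```

Rerunning the function whenever a new text is coded mirrors the point made above: the list stays provisional until the last text has been analysed.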
Although qualitative discourse analysis does not work with prescribed coding schemes, the researcher can apply professional software to support his or her research. I utilised NVivo to deal with large amounts of text and to arrange the company goals, roles and other features mentioned on homepages. NVivo and other software can help make the texts and codes manageable, and the researcher might even want to calculate some statistics with it. However, it is worth mentioning that the aim of qualitative discourse analysis is not to turn texts into quantitative data or to test whether differences between groups are significant. With discourse analysis, one would like to deepen the understanding of a given phenomenon (in this case, the societal role of companies) and grasp what kinds of discursive strategies are utilised by actors in their talks or texts (in this case, on corporate homepages).
As these examples highlight, in the case of qualitative discourse analysis, researchers do not formulate detailed research questions or pre-defined coding schemes. The very nature of the analysis is that it allows the main focuses and topics to emerge in the process. In this fashion, researchers have a better chance to decipher more ‘content’ from texts than what is expected before the analysis.
Consequently, quantitative content analysis and qualitative discourse analysis deal with texts in very different ways.
Text: Content Versus Discourse
By utilising different methods, the same sentence or paragraph can take on different meanings and play different roles in the text. The following example demonstrates this effect:
By continuously improving our quality- and environmental management system we would like to keep our good reputation and keep our customers as well as to enter into new markets.
In the case of content analysis, the researcher checks whether there is enough information about certain topics, programmes or initiatives on the code-list, irrespective of the texts' length or style. As for this excerpt, I might record the code ‘1’ (yes) to the questions ‘Does the company have quality assurance or management system?’ and ‘Does the company have environment program or management system?’ Moreover, from the list of mentioned stakeholder-groups, I might record code ‘1’ to ‘customers’ and ‘environment’.
Conversely, in the case of discourse analysis, the researcher might investigate the societal role or company goal mentioned explicitly or implicitly, the groups referred to or whose interest the company has in view. In this example, ‘keeping good reputation’ and ‘keeping customers’ and ‘entering new markets’ were the goals mentioned, and, in line with these, customers and markets are the two main reference-groups. At the same time, I can identify implicit company goals, like ‘initiation of environmentally friendly operation’ and ‘improvement and innovation’, even if these are referred to as tasks to achieve the aforementioned company goals. Nonetheless, these second-order goals imply that environmental quality as a value might affect corporate management and operation.
The researcher, therefore, collects data (codes in content analysis, and lists of goals, tasks and other features in discourse analysis) by following these coding and interpreting steps again and again, from sentence to sentence and from text to text.
However, there are more differences between these methods in terms of how they process texts. For example, they also differ in which parts of the homepage texts researchers tend to analyse, as the next section of this case study demonstrates.
Differences of the Analysed Sections of Company Homepages
The main goal of content analysis in my research project was to reveal themes, programmes and initiatives characterising Hungarian medium-sized and large companies in connection with their social responsibility and societal role. Therefore, as the first step, I attempted to analyse complete homepages (all the information which can be found on them). However, due to the complexity and interconnectedness of web-texts, as mentioned above, I had to impose some constraints on the analysis to deal with the seemingly endless nature of corporate texts, regarding the date, the content and the types of text.
It is important to mention that I started to analyse texts with content analysis. When, subsequently, I started the qualitative discourse analysis phase of my research project, I attempted to use the same corpus (same texts) for this part of the research with the aim of conducting a meticulous, in-depth analysis. However, the amount of data gathered and, consequently, the items on my list soon became unmanageable. After the first 13 companies analysed, I had to realise with considerable frustration that my list consisted of almost 600 codes regarding goals, roles, styles, homepage levels and so on.
I had to admit to myself that with this type of complex, iterative, qualitative discourse analysis, I could not examine the same amount of text as I could with quantitative content analysis. Therefore, I narrowed the scope of texts to be analysed. So, I only included the ‘about us’ and the ‘vision, mission, values’ sections of pages and I only considered page texts (i.e. downloadable contents were excluded). These were identifiable on almost every homepage, and their style and genre were almost the same or very similar in each of the cases. As a possibility for future research, it would be interesting to analyse different subpages which have a thematic focus such as quality and/or environment management subpages or website sections titled ‘careers’. For the time being, I ruled them out because only a few companies have such sections. In a similar fashion, I also excluded CSR subpages, despite their thematic focus, for two reasons. First, only one-fourth of the companies have a CSR page. Second, these texts strongly resemble the companies' CSR reports, which are a totally different genre, especially in the case of reports based on international standards. Nonetheless, it would also be interesting and relevant to analyse these subpages and compare the results with the analysis concerning the ‘about us’ sections.
In the following section, I present the third essential difference between the two methods, that is, the role of context and the range of mobilised knowledge beyond the texts analysed.
The Role of Context and Range of Mobilised Knowledge
In social sciences, one of the aims of utilising textual analysis concerning the content or the discursive features of texts is to draw conclusions about the authors or social context. In the case of this research, this means that by analysing corporate homepage communication, I attempted to grasp the actual characteristics and features of companies in terms of their social responsibility and societal role. However, the two methods are different in their approach to the range of contextual information involved in the analysis and in its interpretation. In the following sections, I reflect on these differences.
By applying numerical codes for quantitative content analysis, I attempted to compute frequencies and to find the tendencies characterising the authors of the texts analysed, namely, the Hungarian medium-sized and large companies. After I had examined which CSR topics occur on their homepages, I statistically tested differences between groups created according to external dimensions (meaning contextual variables independent of the text). As can be seen, the information involved in the context here is not textual (like other relevant CSR texts or company documents); rather, it is linked to other company-related characteristics regarding the ‘real world’, such as the companies' sector of operation or revenues. Therefore, the only role the context plays in this case is introducing non-textual grouping variables in order to facilitate conducting statistical tests. I tested, for example, whether the differences between medium-sized and large companies or between foreign- and domestic-owned companies are significant (see Tables 1 and 2).
Table 1. Occurrence of employment-topic by ownership (%).
Table 2. Occurrence of part-time job by number of employees (%).
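A group comparison of this kind can be illustrated with a Pearson chi-square test on a 2 x 2 table. The sketch below is an assumption on two counts: the case study does not name the test that was run in SPSS, and the counts are invented rather than taken from Tables 1 and 2. The function uses only the standard library and applies no continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Rows are groups (e.g. foreign / domestic ownership); columns are
    'topic appears' / 'topic does not appear'. All margins must be non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 30 of 50 foreign-owned and 20 of 50 domestic-owned
# companies mention the topic on their homepage.
chi2, p_value = chi_square_2x2(30, 20, 20, 30)
print(round(chi2, 2), round(p_value, 4))
```

The grouping variable (here ownership) comes from outside the texts, which is exactly the limited role that context plays in the quantitative analysis.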
As far as qualitative discourse analysis is concerned, especially in my case, the contextual information involved in the analysis is much more complex; consequently, the role the context plays is more important.
Company characteristics can also be taken into account in discourse analysis to highlight the differences of discursive strategies between different types of companies. However, the range of mobilised knowledge is much broader in this type of analysis, and it might contain both textual and non-textual information. For example, to interpret Hungarian companies' discourse concerning their societal role and corporate goals, I considered several pieces of contextual information such as the following:
- data on the actual CSR activity of Hungarian companies (such as survey data on Hungarian companies' CSR initiatives)
- information about the Hungarian economy (tendencies, sectorial distribution, etc.)
- literature on Hungary being a ‘transfer-country’, which refers, on one hand, to the country's geographical location between Europe and Asia, and to the in-between state in the process of transformation from socialism to democracy and capitalism, on the other. These interrelated characteristics have significant economic implications; thus, they have to be taken into account if one wants to understand corporate identities and social roles in Hungary
- the legal context of CSR (e.g. international and national laws, regulations or initiatives, to which the texts under scrutiny might refer)
- the dominant theoretical schools of CSR, as a point of reference (e.g. implicit references to two of the main approaches to CSR are often easy to identify in homepages: (a) one which states that the only societal role of companies is profit maximisation (Friedman) and (b) one which stresses that companies' social responsibility means that they have to contribute to economic, environmental and social sustainability at the same time (the so-called triple bottom line concept))
These examples show that in the case of qualitative discourse analysis, the range of knowledge mobilised in the interpretation is much wider than in content analysis.
For instance, utterances such as ‘our goal is to implement foreign experience here in Hungary’ or ‘this is a 100% domestically owned and governed company’ could be analysed in themselves without taking into account the external contexts of the text. It is obvious that they refer to the topics of knowledge-transfer and the importance of domestic ownership. But if we also consider the literature on Hungary implying that it is a ‘transfer-country’, the meaning of these sentences gains depth and complexity. It becomes obvious that they contain reference to issues such as the importance or even the superiority of foreign (especially West European) knowledge and technology in the Hungarian economy, or, just the opposite, special emphasis is put on exclusive domestic ownership as compared to business rivals or to actors in other sectors. (I excluded the interrelationship of political and corporate discourse from my analysis.)
To understand what companies define as their societal role and what they thematise as part of their CSR, I examined the texts on their homepages. I utilised two methodological approaches, which complement each other, to investigate these questions. These two methods were content and discourse analyses. The aim of my content analysis was to quantitatively map the different themes based on a pre-fixed coding scheme, while the discourse analysis phase represented a more qualitative and interpretative process.
Apart from the difference in the coding process, the two methods also differ in the manner they deal with the texts under scrutiny. Furthermore, during the analyses both methods mobilise external information about the context and the social environment in order to fully understand the meanings conveyed by textual items. However, they do this ‘mobilisation’ in a different manner. In this case study, by focusing on these methodological differences, I aimed to point to the complementary nature of these methods and to demonstrate how they can be utilised for textual analysis in social science research.
To sum up, in a study utilising texts from the web, a researcher might use content analysis if he or she aims to analyse a large number of cases and to get quantifiable data from textual units. While this method can provide a general picture of certain phenomena, it does not provide insight into deeper levels of texts. Furthermore, content analysis does not focus on the texts' interconnectedness with the broader textual and social contexts, which are important aspects of meaning-making and legitimisation processes. To grasp this role of texts in a discursive process, qualitative discourse analysis might be a more relevant approach.
Exercises and Discussion Questions
- Think through (and gather) the pros and cons of the two methodologies in relation to the following issues:
- the coding process
- the difficulty and complexity of the analysis
- the interpretative flexibility of results
- Consider what can guarantee methodological reliability (in other words, that someone conducting the same research again would obtain the same results) in each of these methods.
Analyse the following example in the social, economic and political context of your country.
Our business is 100% domestically owned and provides employment and livelihood for 430 people. […] Our standards set by high quality criteria can be met with state of the art machines which are the outcome of a modernization process conducted in recent years. Apart from efficient operation, it is equally important to provide high quality products and meet the demands of our customers day by day. Thereby, we combined efficiency, due to European production equipment and production lines, and high quality pastry production based on traditional recipes and technologies. (a Hungarian Mill)
- What kinds of areas of your knowledge have you mobilised to interpret the excerpt (e.g. political system, economic processes, mass media, advertisements, knowledge from your studies and recent news)?
- What does the company emphasise in this text about itself and its societal role?
- What is important for the company in terms of its operation (whose demands it wants to meet)?
- What importance do these issues have in your country and in your economy?
- If it is possible, compare your answers with those of students from different countries.