Lecture 6:

Moderation: the relationship between x and y depends on a third variable z; z is then the moderator. In a regression, moderation is modelled with a multiplicative interaction term between the predictor x and the moderator z. When formulating hypotheses: effects become stronger, weaker or change direction.

Moderation, statistical approach: regress the dependent variable on:

• Predictor (x)
• Moderator (z)
• Interaction of the two (xz)

Tips:

1. Include interaction terms if the hypothesis is conditional
• Conditional hypothesis: a relationship between two or more variables depends on the value of one or more other variables
• Example: an increase in x is associated with an increase in y when condition z is met, but not when condition z is absent
• Example: a positive effect of x on y gets stronger as z increases
• Conditional hypotheses can be tested using interaction terms
2. Include all constitutive terms: include each of the elements that constitute the interaction term
• If you include xz, always include x and z
• If you include xzj, always include x, z, j, xz, zj and xj
• If you include X^2, always include x
3. Interpret correctly: the interpretation of the B's is now different. The coefficient on x no longer describes the effect of x on y in general; it does so only when z is zero. To see the effect of x, calculate the marginal effect of x on y for different values of z
4. Meaningful MEs and SEs: correct interpretation of the B's involves deriving the marginal effects of the independent variables and the uncertainty with which they are estimated
• Without an interaction the marginal effect of x is the same for every value of z; with an interaction it changes with z
• Traditional results tables only report Bs and SEs, so they only tell us whether x has a significant effect on y when z is 0; the effect may be different at other values of z
• If z is binary, present four numbers: the marginal effect of x when z is 0 and when z is 1, and the corresponding standard errors
• If z is continuous, a graphical analysis is required: plot the marginal effect of x across a substantively meaningful range of z, and add the confidence interval for assessing significance...
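The marginal effect and its standard error can be sketched in a few lines of Python. This is only an illustration under assumptions: the model is taken to be y = b0 + b_x·x + b_z·z + b_xz·xz + e, the function names are made up for this example, and the coefficient, variance and covariance values are hypothetical numbers, not results from the lecture.

```python
import math

def marginal_effect(b_x, b_xz, z):
    """Marginal effect of x on y at moderator value z: b_x + b_xz * z."""
    return b_x + b_xz * z

def marginal_effect_se(var_b_x, var_b_xz, cov_b_x_b_xz, z):
    """Standard error of the marginal effect at moderator value z."""
    return math.sqrt(var_b_x + z ** 2 * var_b_xz + 2 * z * cov_b_x_b_xz)

# Hypothetical estimates (assumptions for illustration only)
b_x, b_xz = 0.5, -0.2
var_b_x, var_b_xz, cov_b_x_b_xz = 0.04, 0.01, -0.01

# Binary moderator: report the four numbers (two MEs and two SEs)
for z in (0, 1):
    me = marginal_effect(b_x, b_xz, z)
    se = marginal_effect_se(var_b_x, var_b_xz, cov_b_x_b_xz, z)
    lower, upper = me - 1.96 * se, me + 1.96 * se  # approximate 95% interval
    print(f"z={z}: ME={me:.3f}, SE={se:.3f}, CI=[{lower:.3f}, {upper:.3f}]")
```

For a continuous z, the same two functions can be evaluated over a grid of z values to produce the plot with its confidence band.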


      Follow the author: alinehooiveld@gmail.com
      Summary lecture 1, Empirical research project for IB

      Lecture 1:

      The research process consists of:

      • Start with the research question
      • Theory/literature
      • Hypothesis: testable prediction
      • Then we can define the main variables (dependent and independent variable)
      • Then you need to collect data (measurement)
      • Analyze data (graphically/descriptively)
      • Fit a model
      • Conclusion

      Measurement: a relationship between the numbers and what is being measured. You can measure variables in different kinds of ways. Important to consider:

      1. What do you really want to measure
      2. What is your research question

      Basic issues in measurement:

      1. Validity: extent to which a measure correctly represents the concept of a study (refers to the study, not a specific variable)
      • Internal validity: how well the study was done
      • External validity: whether the results generalize to other situations
      2. Accuracy: is the measure close to the actual value and did you get the right answer on average?
      3. Reliability: extent to which a variable is consistent in what it is intended to measure

      It is important to measure the right thing and be clear about what you measure

      Organize your data:

      • Cross sectional: observations at a given point in time
      • Time series
      • Panel: both cross sectional and time-series dimensions (over a period of time)

      Article Hult et al:

      • Focus on why do some firms outperform others
      • Performance is important variable (often DV)
      • Inconclusive results about determinants of performance
      • Conclusions depend on the measurement of performance
      • No systematic investigation as to how IB research measures performance (contribution)
      • They examine the measurement of performance
      • They do that in 96 articles published in the journal between 1995 and 2005
      • More specifically: they assess the measurement of performance in 3 dimensions:
      1. Type of data source
      2. Type of measure
      3. Level of analysis
      • What did they find: most studies do not measure performance in a manner that captures the multifaceted nature of the construct
      • What they do with their findings: they describe the implications of these results and offer suggestions for improving future practice (present non-binding guidelines)

      Questions the researcher has to deal with:

      • What do you want to measure
      • What kind of data to use:
      1. Primary data: collected by the researcher → time consuming, but original
      2. Secondary data: collected by other agencies; cheap, but lacks originality and may not fit the research question
      • How should we measure performance, because you can measure the same thing in different ways
      1. Financial performance: reflects economic goals
      2. Operational performance: non financial, like innovation, productivity and satisfaction
      3. Overall effectiveness: e.g. reputation (related to both)
      • Which level of analysis to focus on (remember external validity):
      1. Firm
      2. Strategic business unit (SBU)
      3. Inter-organizational unit (cannot be found in Hult et al)
      4. In general at very different levels, country, region, industry, firm etc.

      Why do they all matter: potentially different results and conclusions. This shows the importance of the measurement.

      But: is one right and the other wrong? Should we all agree on a single measure to use? So which one to choose? It depends on the research question.

      Tip:

      • Be clear about what you want to do
      • Choose the appropriate measure for your analysis
      • Justify your decision
      • Be clear about your limitations

      Selection bias:

      • Be careful about the interpretation of your result and be careful about what conclusion you draw.
      • Can compare with another sample

      Endogeneity:

      • Correlation between regressor and error term
      • Reasons: measurement error, omitted variable (any variable that is not included as an independent variable but could influence the dependent variable) and reverse causality (the dependent variable also influences the independent variable)
      • Standard OLS estimate biased
      • There are solutions
      Summary lecture 2, Empirical research project for IB

      Lecture 2:

      Measurement levels:

      • Nominal: the number only tells the category; there is no ranking (no logical order) and the differences between the values have no meaning (can be changed to binary and dummy variables)
      • Ordinal: ordered categories, but no equal distances
      • Interval: information about differences between points on a scale, with equal distances (also called scale variable)
      • Ratio: same as interval, but with an absolute zero, e.g. weight

      Not all calculations can be performed on all data

      Descriptive/summary statistics:

      • Step 1
      • Quantitative description of main features of data
      • Just a summary
      • Before actual analysis
      • Tells us: which are the players of the game, which is the nature of the variable (discrete or continuous) and do you see any problems or need to keep things in mind (like min, max and negative values)

      Summary statistics:

      1. Number of observations (choose minimum)
      2. Measures of central tendency
      • Mean: influenced by extreme observations (use for interval or ratio variables, depending on skewness)
      • Median: middle point when values are ranked in order of magnitude. If the sample size is an even number, the median is the average of the two middle observations. Relatively unaffected by extreme scores (always use for ordinal variables; for interval or ratio variables it depends on skewness)
      • Mode: most frequent value, but there could be more than one mode (use for nominal variables)
      3. Skewness says something about the shape of the distribution and its deviation from normal; for a normal distribution the skewness is 0. If the value is positive the distribution is positively skewed, otherwise negatively. If the value is outside the [-1, 1] range, the distribution is substantially skewed. If the distribution is not skewed use the mean, otherwise use the median
      4. Kurtosis says something about the shape of the distribution and its deviation from normal. For a normal distribution the kurtosis is 3. Leptokurtic: heavy tails, may also be pointy (>3). Platykurtic: light tails, may also be flatter (<3). SPSS reports excess kurtosis, so a value of 0 in SPSS corresponds to 3
      5. Minimum and maximum → define the range (a measure of variability, defined as maximum value minus minimum value, and affected by extreme scores). Also the interquartile range (a measure of variability: q3 minus q1, where q1 is the lower quartile (25%) and q3 is the upper quartile (75%))
      6. Variance and standard deviation (the square root of the variance) → how spread out the data are from the mean; heterogeneity of the sample (measures of variability)
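As a rough illustration of these summary statistics, the sketch below computes them with the Python standard library. The helper names `skewness` and `kurtosis` and the sample values are assumptions made for this example. The two helpers use the simple population-moment formulas; statistical packages may apply small-sample corrections, so their output can differ slightly.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment)^1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def kurtosis(values):
    """Moment-based kurtosis (a normal distribution gives 3, not 0)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2

# Hypothetical sample with one extreme observation
sample = [1, 2, 2, 3, 10]

print("n        :", len(sample))
print("mean     :", statistics.fmean(sample))   # pulled up by the extreme value
print("median   :", statistics.median(sample))  # relatively unaffected
print("mode     :", statistics.mode(sample))
print("range    :", max(sample) - min(sample))
print("variance :", statistics.variance(sample))  # sample variance (n - 1)
print("std dev  :", statistics.stdev(sample))
print("skewness :", skewness(sample))
print("kurtosis :", kurtosis(sample), "(excess:", kurtosis(sample) - 3, ")")
```

With a positively skewed sample like this one, the mean lies above the median, which is why the median is the safer measure of central tendency here.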

      Measure of association

      1. Correlation (coefficient).
      • Strength of the relationship between two variables (can be positive or negative)
      • [-1,1] magnitude says something about the strength of the relationship
      • Coefficient of +1 or -1: two variables are perfectly positively (negatively) correlated. As one increases the other increases (decreases) by a proportionate amount
      • Coefficient of 0: no linear relationship between two variables. As one variable changes the other stays the same
      • Significance (significantly different from 0 or not)
      • Word of warning: direction of causality (from A to B or the other way around) and the third-variable problem → an unmeasured variable that influences the other two. So correlation does not always imply causality.
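A minimal sketch of the correlation coefficient, written out by hand so the formula is visible. The function name `pearson_r` and the data are hypothetical. The last example shows why a coefficient of 0 only rules out a linear relationship: y depends perfectly on x there, but not linearly.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
print(pearson_r(x, [2 * v + 1 for v in x]))   # perfectly positive linear relation
print(pearson_r(x, [-3 * v for v in x]))      # perfectly negative linear relation
print(pearson_r(x, [v ** 2 for v in x]))      # no linear relation, yet y depends on x
```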

      For graphs:

      • Use an appropriate scaling
      • Make sure your graph is self-explanatory

      Article:

      What are we looking at:

      • 433 cross border M&A announcements
      • 58 EMMs
      • 1991-2004
      • Sample: range of industries, mainly from Latin America and Asia. Two firms from HUN and three from ZAF. 78.9% of the transactions initiated by Asian EMMs
      • Data: Thomson SDC Platinum Database
      • How to include the descriptive statistics: first what you examine, then respectively the mean, median and standard deviation, and then kurtosis and skewness.
      • You can also include correlations
      • Event study methodology: to see the impact of announcements on the value of the acquiring firm. Conflicting evidence (opportunities and challenges): diverse conditions and operational flexibility (+); post-acquisition integration and liability of foreignness (-). However, cross-border expansions of EMMs do not create value.
      • Analysis of a cross-sectional sample of firms: to see what influences the market reaction, i.e. what has an effect on bidder value. Positive: target size, ownership structure of the target (private vs. public) and structure of the bidder. Negative: high-tech nature of the bidder, pursuit of targets in related industries
      • Limitations: the assumption that the market response is instantaneous, complete and unbiased. Regional concentration of the parent companies in Asia and Latin America → can we generalize (external validity)? Focus on the bidder’s perspective → but what happens if we look at the combined value of the bidder and the target?
      Summary lecture 3, Empirical research project for IB

      Lecture 3:
      Research process:

      • Research questions, motivations and contribution
      • Hypotheses
      • Data
      • Analysis
      • Results
      • Check whether the results confirm hypothesis

      Correlation:

      • Correlation coefficient r
      • Bounded measure

      Regression analysis:

      1. simple bivariate regression: one single variable
      • x1: explanatory variable; independent variable; regressor; predictor variable …
      • y: dependent variable and outcome variable
      • B0: intercept/constant
      • B1: slope; regression coefficient for the variable x1 (size or magnitude of the effect x1 on y)
      • E: error term
      • We want to estimate B0 and B1 (after estimation you have estimated coefficients)

      How to estimate:

      • OLS: technique to best represent the linear relationship between x and y (minimize the sum of the squared residuals, i.e. the squared vertical distances from the line)
      • How to draw a straight line: find the one that best fits the data (goes through, or as close to, as many of the data points as possible)
      • Minimize the sum of the squared estimated errors
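To make the estimation concrete, here is a small sketch of bivariate OLS for the model y = B0 + B1·x1 + E, using the closed-form solution that minimizes the sum of squared residuals. The function name `fit_simple_ols` and the data points are assumptions for illustration.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals of y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
ssr = sum(e ** 2 for e in residuals)  # the quantity OLS minimizes

print(f"intercept b0 = {b0:.4f}, slope b1 = {b1:.4f}, sum of squared residuals = {ssr:.4f}")
```

Any other straight line through these points would give a larger sum of squared residuals than the fitted one.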

       

      2. Multiple regression: two or more independent variables: we analyze the relationship between a single dependent variable and several independent variables
      • The coefficient of each independent variable indicates the change in the dependent variable when that independent variable changes but all other independent variables remain constant (ceteris paribus)

      Dummy variables independent variables:

      How many dummies to include:

      With an intercept, include m-1 dummies for a variable with m categories; the omitted category is the base/reference. So if you have only two categories, one dummy is sufficient to represent both. If you include a dummy for every category, one is perfectly predicted by the others (exact linear relationship) and the regression coefficients cannot be estimated.
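The m-1 rule can be sketched as follows. The helper `make_dummies` and the country codes are hypothetical; the point is only that the reference category gets no column of its own.

```python
def make_dummies(values, reference):
    """Create m-1 dummy columns (0/1 lists), omitting the reference category."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference must be one of the observed categories")
    return {
        category: [1 if v == category else 0 for v in values]
        for category in categories
        if category != reference
    }

# Hypothetical nominal variable with m = 3 categories
country = ["NL", "DE", "US", "NL"]
dummies = make_dummies(country, reference="NL")

for name, column in dummies.items():
    print(name, column)
# Observations from the reference category are 0 on every dummy.
```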

      Regression coefficients:

      What to look for when interpreting:

      1. Sign
      2. Significance
      3. Size, magnitude
      4. Then interpret
      • The regression coefficient b1 measures the amount of change in y due to a change in x1 while the other regressors are held constant (ceteris paribus)
      • It represents the change in the DV for a one-unit increase in the IV while the other regressors are held constant.

      Article:

      What is about:

      • LMEs vs CMEs
      • Flexible vs rigid labour markets.
      • What does this mean for management bureaucracies: starting point

      Hypothesis:

      • High shares of flexible workers → high shares of managers
      • Labour market flexibility → lack of trust, loyalty, commitment → management and control required
      • Labour market flexibility → lack of learning and knowledge accumulation → bad for innovation and labour productivity growth

      Why are we doing this:

      • Insignificant or conflicting results in literature
      • Sparse empirical work on topic
      • Important to test the theory at the firm level
      • The Netherlands represent an interesting case to do so: CME but with LME features

      What are we looking at:

      • Data source: SCP organization-level survey data for the Netherlands → data collection: telephone interviews plus a postal survey.
      • Information on:
      1. Organizations
      2. From various sectors
      3. Percentage of employees with managerial positions in an organization
      4. Shares of flexible workers
      5. Industry
      6. Firm size
      • Survey of 2009-10 is used

      Theory → variables → data → model

      • The substantial variation in shares of flexible workers (IV → combination of % of temporary workers and % of manpower agency workers plus freelancers) allows analysing the impact of the latter on management ratios (DV), controlling for a number of influential factors (firm size, growth, age; dummies for restructuring operation, competitive market and sensitivity to business fluctuations; industry dummies). How am I supposed to know which controls to include? → look at the literature

      A note on the regression equation:

      • The hypotheses lead to the regression equation
      • Sometimes the regression equation is explicitly presented in the paper
      • Here, the regression equation is not presented, but you should be able to infer it from the text and/or regression table
      • Choose the key IV and then look at its regression coefficient. Also look at the significance level

      Assessing your model:

      • Goodness of fit
      • Coefficient of determination (R-squared):
      1. Proportion of the variation in y that is explained by the linear combination of the x variables
      2. Between 0 (no prediction) and 1 (perfect prediction)
      3. In a bivariate linear regression, it equals the squared correlation coefficient
      4. Example: if R2 is 0.30, 30% of the variation in y is explained by the x’s
      • R2 increases as we keep adding variables
      • Adjusted coefficient of determination (adjusted R2):
      1. Modified measure of R2 that takes into account the number of independent variables and sample size
      2. Some punishment for the inclusion of additional explanatory variables
      3. When comparing between models, search for the highest adjusted R2
      • F test: a test for the overall significance of the model; tests whether all slope parameters are jointly zero (b1 = b2 = ... = 0) → compare to critical values of the F-distribution or look at the corresponding p-value or significance
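The three fit measures above can be written down compactly. This is a sketch using the standard textbook formulas, with n observations and k independent variables; the function names and the example values (R2 = 0.30, n = 50, k = 3) are assumptions for illustration.

```python
def r_squared(y, y_hat):
    """Share of the variation in y explained by the fitted values y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total variation
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """R2 penalized for the number of independent variables k, given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """F statistic for H0: all k slope coefficients are jointly zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical model: R2 = 0.30 with 50 observations and 3 independent variables
r2, n, k = 0.30, 50, 3
print("adjusted R2:", round(adjusted_r_squared(r2, n, k), 4))
print("F statistic:", round(f_statistic(r2, n, k), 4))
```

The F statistic is then compared to the critical value of the F-distribution with k and n-k-1 degrees of freedom.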

      Estimates and significance:

      • Does my main IV have a significant impact on the DV
      • How can I determine this: statistical significance (compare to 0 with a t-test)
      • Note: statistical vs economic significance
      • First formulate the null and alternative hypotheses: H0: bj = 0, H1: bj is not 0
      • Choose the significance level
      • Option 1:

       

      • Option 2:

      • So if the p-value is below the chosen significance level (e.g. p-value < 0.01), the estimate is significantly different from 0
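A hedged sketch of the significance check for H0: bj = 0. The t-statistic is the estimate divided by its standard error. For the p-value this example uses a normal approximation, which is only reasonable in large samples; the exact test uses the t-distribution with n-k-1 degrees of freedom. The function names and the numbers are hypothetical.

```python
from statistics import NormalDist

def t_statistic(estimate, standard_error):
    """t-statistic for H0: coefficient = 0."""
    return estimate / standard_error

def approx_two_sided_p_value(t):
    """Two-sided p-value using the standard normal as a large-sample approximation."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Hypothetical estimate and standard error
b_j, se_j = 0.5, 0.2
alpha = 0.05  # chosen significance level

t = t_statistic(b_j, se_j)
p = approx_two_sided_p_value(t)
print(f"t = {t:.2f}, approximate p-value = {p:.4f}")
print("significantly different from 0" if p < alpha else "not significantly different from 0")
```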
      Summary lecture 4, Empirical research project for IB

      Lecture 4:

      1. Multicollinearity: high correlation between at least two independent variables. Correlation:
      • Strength of the relationship between two variables
      • Positive or negative
      • Values between -1 and +1

      Special case for multicollinearity:

      • Perfect multicollinearity, meaning perfect correlation; OLS is not possible. Example: including a dummy for every possible group/category instead of leaving one out as the reference

      More common:

      • Multicollinearity: high but not perfect correlation. Examples: including variables that are lagged values of one another, or including variables that capture similar phenomena.
      • Multicollinearity, intuition: in general there is nothing wrong with including correlated variables, but if x1 and x2 are perfectly/highly correlated, it is hard to identify the effect of x1 on y, because whenever x1 changes, x2 changes with it.

      What happens if we do OLS anyway?

      • Hard to identify the individual impact of each x (which variable should take the credit for explaining variation in y?)
      • Larger standard errors and insignificance.
      • Nonsensical coefficient signs and magnitudes (unbelievably large estimates and wrong/counterintuitive signs), so we cannot trust the results

      How can we check whether we have a multicollinearity problem?

      • Check the data before estimation
      • Correlation coefficients/correlation matrix → correlation coefficients around 0.7-0.8 (in absolute value) signal multicollinearity
      • Variance inflation factor (VIF) → can be computed for one IV, but also for all. Measured by 1/(1-R2), where R2 comes from regressing that IV on the other IVs. For values higher than 10, multicollinearity is a problem
      • For multicollinearity only look at the IVs
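A small sketch of the VIF for the simplest case of two independent variables. With only two IVs, the R2 from regressing one on the other equals their squared correlation, so the VIF is 1/(1-r²). The function names and data are assumptions for illustration; with more IVs, the R2 would come from a multiple regression of each IV on all the others.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient between two variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two IVs: 1 / (1 - r^2)."""
    r = correlation(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical independent variables
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

print("correlation:", correlation(x1, x2))
print("VIF        :", vif_two_regressors(x1, x2))
```

Here the correlation of 0.8 sits in the warning zone of the correlation check, while the VIF stays well below 10, which shows that the two rules of thumb do not always agree.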

      How can the problem be solved?

      • Increase the sample size, but, is that feasible?
      • Drop one of the variables. If two variables measure the same thing, do a robustness check: first estimate the model with one variable, then with the other (the results should be similar). But the question is whether you dropped an important/relevant variable
      • Transform the highly correlated IVs: log transformation, or create a composite variable that combines the collinear IVs. But does that make sense for your model? What about the interpretation of the regression coefficient?
      2. Heteroscedasticity. The OLS assumption is homoscedasticity: the variance of the error term is constant over the various values of the IVs, so the dispersion of the error remains the same over the range of observations.
      • Heteroscedasticity: this OLS assumption does not hold. The error term does not have a constant variance; the dispersion of the error changes over the range of observations → different variances

      Problem:

      • Dispersion of the error changes over the range of observations
      • Why: groups of observations are different; they follow different processes and have different error terms

      What happens if we do OLS anyway?

      • OLS assumption violated
      • Biased standard errors
      • Unreliable t-statistic
      • Unreliable significance test
      • Misleading conclusions about significance
      • OLS estimators are not efficient

      How to test for heteroscedasticity problem?

      • Breusch-Pagan test
      • White test
      • Scatterplot of residuals: make a graph with the IV and the residuals of your regression, one for each independent variable (multiple regression: multiple plots). You want to see most of the scores concentrated in the center, with no systematic patterns

      How can we solve the problem?

      • Weighted least squares: alternative estimator in which each observation is weighted → observations with a higher (lower) variance get a lower (greater) weight in determining the regression coefficients.
      • Calculate robust standard errors: adjust the OLS standard errors for heteroscedasticity
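To show what "robust standard errors" means in the simplest case, the sketch below compares the conventional OLS standard error of the slope with a heteroscedasticity-robust one (the basic White/HC0 form) in a bivariate regression. The function name and the data are assumptions; statistical packages usually offer several robust variants with small-sample corrections, so their numbers can differ from this basic form.

```python
import math

def slope_with_standard_errors(x, y):
    """Return (b1, conventional SE, robust HC0 SE) for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (b - mean_y) for d, b in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    # Conventional SE: assumes one common error variance for all observations
    se_conventional = math.sqrt(sum(e * e for e in residuals) / (n - 2) / sxx)
    # Robust SE: lets each observation contribute its own squared residual
    se_robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, residuals))) / sxx
    return b1, se_conventional, se_robust

# Hypothetical data in which the error spread grows with x
x = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.1, -0.1, 0.3, -0.3, 0.8, -0.8, 1.5, -1.5]
y = [2 * a + e for a, e in zip(x, noise)]

b1, se_conventional, se_robust = slope_with_standard_errors(x, y)
print(f"slope = {b1:.4f}, conventional SE = {se_conventional:.4f}, robust SE = {se_robust:.4f}")
```

The slope estimate is the same in both cases; only the standard error, and therefore the t-statistic and significance test, changes.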

      Article:

      • Corporate governance of foreign subsidiaries in MNEs (starting point)
      • Focus on the subsidiary board → key for sound corporate governance
      • How → oversee performance on behalf of HQs, review strategic plans and internal policies, help integrate the subsidiary into the MNE, facilitate access to resources and knowledge
      • Why are we doing this? → scant academic research on the roles of subsidiary boards and the factors that affect these roles, lack of empirical evidence, inconclusive results (we really don’t know much about it). Interesting and relevant for academics and practitioners.
      • In general: investigate the determinants of the roles of subsidiary boards in MNEs. More specifically, a dual focus: strategy of the subsidiary and nationality of the subsidiary directors
      • Sample/data: survey data/questionnaires to CEOs. Bel-first database: subsidiaries operating in Belgium (1 host country), with HQs in 14 countries. Only the largest subsidiary (if more than one), more than 50 employees, not in financial industries. Final result → 428 subsidiaries with 83 responses
      • Four roles of a subsidiary board:
      1. Control: monitor decisions and evaluate performance of subsidiary
      2. Strategy: provide advice
      3. Coordination: transfer information/knowledge between HQs and subsidiary
      4. Service: provide local knowledge, access to local resources
      • Three strategic types of foreign subsidiaries:
      1. Local implementer: only in the local market, activities independent from the rest of the MNE
      2. Specialized contributor: routine tasks, highly integrated into MNE operations
      3. World mandate: responsible for a broad scope of activities, involved in corporate strategy
      • From theory to hypothesis:
      1. Agency and resource dependence theory (H1a-d): the subsidiary board is more involved in the control, strategy, coordination and service roles in local implementer subsidiaries than in world mandate and specialized contributor subsidiaries
      2. Board internationalization and resource dependence theory (H2a-d): the board of the local implementer subsidiary is more involved in the control, strategy and coordination roles, but less involved in the service role, when more subsidiary directors are HQs country nationals
      • In other words, how to test the hypotheses?
      • Dependent variables: constructed based on questions on control, strategy, service and coordination
      • Independent variables: local implementer or not → dummy; HQs country directors → proportion of directors who are HQs country nationals
      • Other/control variables (what does the literature say?) → subsidiary size, wholly owned, CEO tenure, HQs country
      • From variables to analysis: OLS regression analysis, robust standard errors (heteroskedasticity)
      • Are the hypotheses confirmed?
      1. Hypothesis 1 is partly confirmed → the subsidiary board is more involved in control, strategy and service roles in local implementer subsidiaries
      2. Hypothesis 2 is confirmed for strategy and coordination → the board of the local implementer subsidiary is more involved in strategy and coordination roles when more directors are HQs country nationals.
      Summary lecture 5, Empirical research project for IB

      Lecture 5:

      1. Outliers: data points that do not follow the general trend of the data (extreme values)

      What happens if we run the regression anyway?

      • The fit of the model can change
      • The regression line may be tilted
      • You can remove an outlier

      How can we check whether we have outliers?

      • Scatterplots
      • Statistical test
      • Easiest way: the range of +/- 2 to 3 standard deviations around the mean (include the words "at least" in the conclusion!) → based on the assumption that the distribution is normal
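The standard-deviation rule can be sketched as a simple check. The function name `flag_outliers` and the data are hypothetical. The example also shows a practical caveat: the outlier itself inflates the standard deviation, so the stricter 3-SD threshold can miss an extreme value that the 2-SD threshold catches.

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) > threshold * sd]

# Hypothetical data with one extreme observation
data = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]

print("flagged at 2 SD:", flag_outliers(data, threshold=2.0))
print("flagged at 3 SD:", flag_outliers(data, threshold=3.0))
```

A flagged value is only a candidate: whether to keep or drop it still depends on the reason for the outlier and on how sensitive the results are to it.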

      How can we solve the problem?

      • First think: what is the reason for the outlier, can/could you do something?
      • Throw the outlier out of the dataset only for a valid reason, such as mismeasurement, an error in the observation or a data entry error, but not because it is convenient to do so.
      • Be careful: some extreme values are to be expected and are indicative of the characteristics of the population. It is therefore important to check how sensitive your results are to the presence of the outlier → what happens if we keep the outlier, and what happens if we omit it?
      • If the outlier does not change the results, but does affect the assumptions, you may drop the outlier
      • If it affects both results and assumptions, you may not simply drop the outlier; you have to run the regression both with and without the outlier and report that in the paper
      • If a relationship is clearly created by the outlier, you may drop the outlier, because without it there would be no relationship between x and y, so the regression coefficient does not truly describe the effect of x on y
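
The sensitivity check (regression with and without the outlier) could look roughly like this. The data are made up so that the first six points show no relationship at all and a single extreme point creates one; `ols_slope` is a hypothetical helper for a simple regression with one IV.

```python
def ols_slope(x, y):
    """Slope of a simple regression of y on x: cov(x, y) / var(x)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: the last observation (20, 30) is the outlier.
x = [1, 2, 3, 1, 2, 3, 20]
y = [4, 5, 4, 5, 4, 5, 30]

slope_with = ols_slope(x, y)               # outlier kept
slope_without = ols_slope(x[:-1], y[:-1])  # outlier omitted

print(f"with outlier:    {slope_with:.2f}")
print(f"without outlier: {slope_without:.2f}")
```

Here the slope is clearly positive with the outlier and zero without it, which is the situation in which the relationship is created by the outlier.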
      2. Reverse causality: we assume that changes in the dependent variable are caused by changes in the independent variables. But we only find a statistical relationship, which says nothing about causality or the direction of causality. In some analyses it could be that y (also) causes x, which is called reverse causality → this causes an endogeneity problem

      How can we check whether there is a reverse causality problem?

      • What does the theory say
      • Timing of measurement: Theory says x causes y, but sometimes x is measured later than y
      • Statistical tests (to check whether changes in x precede changes in y) and some more advanced techniques

      What to do:

      • Have a model that is well-grounded in theory
      • Explain
      • Acknowledge
      • In general: advanced econometric techniques also exist to mitigate the problem of endogeneity
      3. Omitted variable bias: which variables should we include as IVs, and what happens if we omit relevant variables? You have omitted variable bias if:
      • An excluded variable has some effect on your DV and
      • It’s correlated with at least one of your IVs (endogeneity)

      It is impossible to control for everything, so how do we solve the problem?

      • Avoid simple regression models (with one IV)
      • Include variables that are likely to be the most important theoretically in explaining the DV (what does the literature say?)
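
A small simulation can make the two conditions for omitted variable bias concrete. Everything here is an assumption for illustration: the true effect of x on y is set to 1, the omitted variable z affects y (coefficient 2) and is correlated with x. Controlling for z is done by partialling z out of both x and y, which gives the same coefficient on x as a regression of y on x and z (the Frisch-Waugh-Lovell result).

```python
import random

def mean(v):
    return sum(v) / len(v)

def ols_slope(x, y):
    """Slope of a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def residualize(v, z):
    """Remove from v the part that is linearly explained by z."""
    b, mv, mz = ols_slope(z, v), mean(v), mean(z)
    return [vi - mv - b * (zi - mz) for vi, zi in zip(v, z)]

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]        # the variable we might omit
x = [0.8 * zi + random.gauss(0, 1) for zi in z]   # x is correlated with z
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

biased = ols_slope(x, y)                                         # z omitted
controlled = ols_slope(residualize(x, z), residualize(y, z))     # z controlled for

print(f"z omitted:    {biased:.2f}")
print(f"z controlled: {controlled:.2f}")
```

With z omitted, the estimated effect of x picks up part of the effect of z and ends up well above the true value of 1; once z is controlled for, the estimate is close to 1.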

      Panel data or longitudinal data: data on many units collected at several points in time, whereby each unit is observed several times. Panel data therefore have both a cross-sectional and a time-series dimension.

      Why panel data:

      1. Rich in information
      2. Potentially, an increase in sample size
      3. Possibility to control for time-invariant effects correlated with the regressors
      4. How? Intuition: include dummy variables for each cross-section unit and use fixed effects.
      5. Mitigate omitted variable bias

      Fixed effects model: a statistical regression model in which the intercept is allowed to vary freely across individuals or groups. It is often applied to panel data in order to control for any individual-specific attributes that do not vary across time, which removes the omitted variable bias caused by such attributes. Assumption: the individual-specific effects are correlated with the IVs

      Assume: for the Grunfeld data we concluded that the OLS assumption that the investment behaviour of all firms in all years is the same → is not realistic. The fixed effects model offers another way of relaxing that assumption, namely by assuming that each firm has a number of unique characteristics that influence the firm’s investment behaviour. These unique characteristics are captured in the model by including a separate dummy variable for each firm.

      In the Grunfeld example we assume:

      • Each firm has a unique characteristic which is stable over time
      • Random error term is assumed to satisfy the usual OLS assumptions
      • Hence each firm i gets a different intercept parameter, but the slope coefficients b2 and b3 are assumed to be the same for all firms
      • An easy way to estimate the model is to create a dummy variable for each firm and add these dummies to the model
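
The idea of firm-specific intercepts with a common slope can be sketched as follows. This is a simplified illustration, not the Grunfeld analysis itself: the data are invented, there is only one regressor instead of two, and instead of literally adding dummies the code subtracts each firm's own mean from x and y (the "within" transformation), which yields the same slope as the dummy-variable regression.

```python
from collections import defaultdict

def fixed_effects(firm, x, y):
    """Within estimator: common slope plus one intercept per firm."""
    groups = defaultdict(list)
    for f, xi, yi in zip(firm, x, y):
        groups[f].append((xi, yi))
    # Firm-level means of x and y.
    means = {f: (sum(p[0] for p in obs) / len(obs),
                 sum(p[1] for p in obs) / len(obs))
             for f, obs in groups.items()}
    # Demean each observation by its own firm's mean.
    x_dm = [xi - means[f][0] for f, xi in zip(firm, x)]
    y_dm = [yi - means[f][1] for f, yi in zip(firm, y)]
    slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)
    intercepts = {f: my - slope * mx for f, (mx, my) in means.items()}
    return slope, intercepts

# Hypothetical panel: three firms, three years each, true slope 2,
# firm intercepts 0, -10 and -20 (correlated with x).
firm = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2, 4, 6, -2, 0, 2, -6, -4, -2]

slope, intercepts = fixed_effects(firm, x, y)

# Pooled OLS that ignores the firm effects, for comparison.
mx, my = sum(x) / len(x), sum(y) / len(y)
pooled = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))

print(slope, intercepts, pooled)
```

In this constructed example pooled OLS even gets the sign of the slope wrong, because the firm intercepts are correlated with x, while the fixed effects estimate recovers the common slope and the three intercepts.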

      General equation FE model
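
The equation itself is not reproduced in these notes. A standard way of writing it that is consistent with the assumptions listed for the Grunfeld example (a firm-specific intercept, common slopes b2 and b3) would be the following; the symbols x2, x3, e and the dummy D are notation assumed here, not taken from the lecture.

```latex
% Firm i, year t: intercept b_{1i} differs per firm, slopes are common
y_{it} = b_{1i} + b_2 x_{2,it} + b_3 x_{3,it} + e_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T

% Equivalent dummy-variable form: D_{j,i} = 1 if i = j and 0 otherwise
y_{it} = \sum_{j=1}^{N} b_{1j} D_{j,i} + b_2 x_{2,it} + b_3 x_{3,it} + e_{it}
```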

      Restrictions of FE model, the FE model is very powerful but:

      • We cannot include variables that do not vary over time: all stable characteristics are captured by the dummies, which leaves no variation for estimating the effects of variables that vary only between economic entities
      • You can only include those that change over time
      • However, you can still examine the interaction between group dummies and time-varying variables in an FE model

      When should you use an FE model? → If you are concerned about omitted factors that may be correlated with key predictors at the group level

      Interpretation of results → similar to OLS

      Logs in the regression equation. In general, don’t forget:

      • Sign, size, significance
      • Use the unit of measurement of y and x when given
      • Ceteris paribus

      4 situations:
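
The four situations are not listed in these notes. Presumably they are the four standard combinations of a logged or unlogged y and x; under that assumption, the textbook interpretations of a coefficient b can be summarised in a small helper. The function name `effect_on_y` is hypothetical, and the percentage interpretations are approximations that work best for small changes.

```python
def effect_on_y(b, y_logged, x_logged):
    """Describe the ceteris paribus effect of x on y implied by coefficient b."""
    if not y_logged and not x_logged:   # y = a + b*x
        return f"a 1-unit increase in x changes y by {b:g} units"
    if y_logged and not x_logged:       # ln(y) = a + b*x
        return f"a 1-unit increase in x changes y by about {100 * b:g}%"
    if not y_logged and x_logged:       # y = a + b*ln(x)
        return f"a 1% increase in x changes y by about {b / 100:g} units"
    return f"a 1% increase in x changes y by about {b:g}%"  # ln(y) = a + b*ln(x)

print(effect_on_y(3, False, False))
print(effect_on_y(0.05, True, False))
print(effect_on_y(200, False, True))
print(effect_on_y(0.8, True, True))
```

In each case the units of measurement of y and x should be filled in where "units" appears.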

      Robustness/sensitivity analysis:

      To what end? → To determine how sensitive your results are to changes in the model

      Experiment with:

      • Combinations of (other) control variables
      • Datasets
      • Time frames

      Always rely on theory and literature

      If your results remain the same, they are robust.
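
As a rough sketch of such a check, the same simple regression can be re-estimated on different time frames to see whether the coefficient keeps its sign and size. The data, the chosen windows and the helper `ols_slope` are all hypothetical; a real robustness analysis would also vary the control variables and datasets.

```python
def ols_slope(x, y):
    """Slope of a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

# Hypothetical yearly observations, 2010-2019.
years = list(range(2010, 2020))
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

# Alternative time frames to re-estimate the model on.
windows = {"2010-2019": (2010, 2019),
           "2010-2014": (2010, 2014),
           "2015-2019": (2015, 2019)}

slopes = {}
for label, (start, end) in windows.items():
    keep = [i for i, yr in enumerate(years) if start <= yr <= end]
    slopes[label] = ols_slope([x[i] for i in keep], [y[i] for i in keep])

for label, b in slopes.items():
    print(f"{label}: slope = {b:.2f}")
```

If the slope stays close to the same value in every window, as it does with these made-up numbers, the result can be called robust to the choice of time frame.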