rm(list=ls())
gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 514580 27.5 1148107 61.4 644242 34.5
## Vcells 920313 7.1 8388608 64.0 1637054 12.5
library(conjoint)
library(skpr)
## 필요한 패키지를 로딩중입니다: shiny
library(AlgDesign)
library(mlogit)
## 필요한 패키지를 로딩중입니다: dfidx
##
## 다음의 패키지를 부착합니다: 'dfidx'
## The following object is masked from 'package:stats':
##
## filter
library(texreg)
## Version: 1.38.6
## Date: 2022-04-06
## Author: Philip Leifeld (University of Essex)
##
## Consider submitting praise using the praise or praise_interactive functions.
## Please cite the JSS article in your publications -- see citation("texreg").
library(dotwhisker)
## 필요한 패키지를 로딩중입니다: ggplot2
library(ggplot2)
library(jtools)
library(grid)
library(gridExtra)
library(dplyr)
##
## 다음의 패키지를 부착합니다: 'dplyr'
## The following object is masked from 'package:gridExtra':
##
## combine
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
참고사이트: https://cran.rstudio.com/web/packages/ExpertChoice/vignettes/practical.html
reference: Street, D.J., Burgess, L. and Louviere, J.J., 2005. Quick and easy choice sets: constructing optimal and nearly optimal stated choice experiments. International Journal of Research in Marketing, 22(4), pp.459-470.
컨조인트 분석(Conjoint Analysis)은 실험설계에 의해 구성된 다속성자 극물(multiattribute stimuli)1)에 대한 소비자의 선호를 수리적으로 분석하여 어떤 제품이 갖고 있는 속성(attribute) 하나 하나에 고객이 부여하는 효용(utility)을 추정함으로써, 그 고객이 선택할 제품을 예측하기 위한 기법이다. Hair, Anderson, Tatham and Black(1995)에 의하면 컨조인트 분석은 응답자가 재화 또는 서비스에 대한 선호를 어떻게 나타내는지를 이해하기 위해 특별하게 사용된 다변량기법(multivariate technique)이라고 정의한다. 즉 소비자가 원하는 속성을 분석하여 상품이나 서비스의 속성 을 최적으로 구성하는데 이용될 수 있는 방법이다.
그러므로 컨조인트 분석은 특정대상의 속성들에 대한 평가를 통해 이들이 어느 정도 공헌하고 있는지 분석하는 것이다. 이 분석은 각각의 독립변수(각 속성)가 전체의 종속변수(효용)에 가산적(additive)으로 공헌한다는 가정하에 분석된다.
기본절차 속성과 속성수준 선정, 분석모델의 결정, 자료수집방법 선정, 자극물 제시 방법선택, 모수추정방법 선정, 결과 해석의 순서로 이루어진다. 다음에서 이를 구체적으로 살펴 보고자 한다.
컨조인트 분석을 실시하기 위하여 먼저 소비자가 제품을 구입할 때 고려하는 속성(독립변수)들이 무엇인지를 알아야 한다. 평가대상의 선호도나 효용을 측정하는데 필요한 속성들을 규명하기 위하여 주로 이용되는 방법은 평가자들에게 중요하게 생각하는 속성(독립변수)들이 무엇인지 직접 물어보는 방법이다. 이외에도 표적집단면접(FGI)이나 심층면접(Depth Interview)을 이용하거나 전문가들의 조언을 통하여 선정하는 방식을 취할 수 있다. 본고의 적용 예에서는 소비자들이 아직 인터넷 구매 경험이 부족하기 때문에 전자상거래 관련 전문가들의 판단을 통해 속성을 선정하 는 방법을 사용하였다.
이상과 같은 방법을 통하여 관련된 속성들을 선정하는데, 이 과정에서 가장 중요한 것은 분석의 효율성을 고려하여 속성의 수를 중요한 몇 개로 축소하는 것이다.
속성의 수와 명칭이 정해지고 나면 개별 속성들의 수준을 결정해야 한다. 속성수준을 결정할 때 고려해야 할 사항은 각 속성별 수준의 수와 각 수준간의 적정차이의 결정이다. 여기에서 각 속성 수준의 수에 따라 응답자가 비교 평가하여야 할 총 속성의 수가 정해짐을 고려해야 한다. 만약 세 가지의 속성이 선정되었을 경우 두 속성은 3개의 수준으로, 한가지 속성은 2개의 수준으로 나누었다면 응답자에게 제시되는 총자극의 수는 32×2=18개가 된다. 총자극의 수가 너무 많아지면 응답자가 응답에 곤란을 겪게 되고, 반대로 총자극의 수가 너무 적어지면 모델의 추정오차가 커지게 된다.
따라서 적정수준의 자극의 수를 사용하는 것이 바람직하다. 다음 으로 고려할 사항은 각 수준을 어떻게 나눌 것인가이다. 각 수준들이 응답자에게 거의 차이가 없으면 수준을 나누는 의미가 없어지기 때문에 가능한 한 수준을 나눌 때는 현재 출시되고 있는 제품들의 수준을 반영해주는 것이 바람직하다.
두 번째 단계는 분석모델을 결정하는 것이다. 분석모델이란 선호도를 계산하는 방법으로 이상점모델(ideal point model), 벡터모델(vector model), 부분가치함수모델(part worth function model)이 있다. 각 모델들은 다음과 같이 설명될 수 있다. 예를 들어 다음과 같이 속성 \(p\)를 정의한다고 하자.
\[ p = 1, 2,.......................................,t\]
식 (1)을 선정된 \(t\)개 속성의 집합이라고 하고, \(y_{jp}\)는 \(j\)번째 자극에서 \(p\)번째 속성의 수준이라고 하자. 여기서 \(y_{jp}\)는 연속적(continuous) 변수일 수도 있고, 명목적(categorical) 변수일 수도 있다.
먼저 벡터모델(vector model)에서는 \(j\)번째 자극에 대한 선호도(\(S_j\))가 다음과 같이 나타난다.
\[S_j = \Sigma{W_p y_{jp}}\]
여기서 \(W_p\)는 \(t\) 속성에 대해 개인마다 다르게 부여하는 가중치이다. 따라서 벡터모델은 소비자의 선호도가 대상이 지니고 있는 각 속성수준과 이에 부여되는 가중치에 의해 결정된다고 보고 있다. 만약 벡터의 속성과 같이 속성에 대한 점수들이 무한히 커지면 선호도도 커지게되므로 이는 다 속성태도모델인 피시바인-로젠버그모델(Fishbein- Rosenberg Model)과 같다. 피시바인-로젠버그 모델에서도 각 대상들이 가지고 있는 속성의 양이 많으면 많을수록 대상에 대한 태도는 좋아진다.
이상점모델(ideal point model)은 선호도(\(S_j\))가 \(j\)번째 자극의 \(y_{jp}\)와 개인의 이상점인 \(x_P\) 사이의 거리의 제곱(\(d^2_j\))과 음(-)의 관계에 있다고 가정한 다. 여기서 \(d^2_j\)은 다음과 같이 정의한다.
\[d^2_j = \Sigma{W_p}(y_{jp} - X_p)^2 \] 따라서 이상점에 가까운(작은 \(d^2_j\)) 자극을 더 선호하게 된다.
부분가치함수모델(part worth function model)을 식으로 표현하면 다음과 같다.
\[S_j = \Sigma{f_p(y_{jp})}\] 여기서 \(f_p\)는 \(p\)번째 속성의 여러 수준에 대한 부분가치를 나타내는 함수이다. \(f_p(y_{jp})\)는 보통 3~4개의 \(y_{jp}\)를 이용하여 추정하는데, 각 속성수준(\(y_{jp}\))에 대한 부분가치는 선형보간법(linear interpolation)에 의해 얻어지므로 부분가치함수는 꺽인 선형함수의 형태를 띠게 된다. 추정범위 밖에 존재하는 \(y_{jp}\)에 대한 부분가치는 꺽인 선형함수의 외삽법(extrapolation)에 의해 구할 수는 있지만 타당성은 보장되지 않는다.
부분가치함수모델은 각 속성수준에 대한 선호함수가 다양한 형태를 가질 수 있으므로 상당한 융통성을 지닌 모델이다. 왜냐하면 \(f_p(y_{jp}) = w_p(y_{jp} - x_p)^2\)으로 정의하면 이상점 모델이 될 수도 있고, \(f_p(y_{jp}) = w_py_{jp}\)로 정의하면 벡터모델을 얻을 수도 있기 때문이다. 한편, 이상점모델은 벡터모델 보다 융통성이 있다. 왜냐하면 벡터모델은 이상점모델의 특수한 형태, 즉 \(x_p → \pm\infty\) 인 경우이기 때문이다.
부분가치함수모델은 융통성이 있기는 하지만 더 많은 모수를 추정해야 하는 부담(신뢰성이 낮아짐)과 선형보간법에 의해 중간값을 근사치로 구해야 한다는 단점이 있다. 즉 벡터모델은 추정해야 할 모수(\(w_p\))가 \(t\)개이고, 이상점모델은 \(w_p\)와 \(x_p\)가 있으므로 \(2t\)개인 반면에, 부분가치함수모델은 각 속성마다 \(q\)개의 수준이 있으면 \((q-1)t\)개의 모수를 추정해야 하는 것이다. 여기서 \((q-1)\)이 되는 이유는 \(f_p(y_{jp})\)를 모델을 변경시키지 않고 \(f_p(y_{jp}) = a_p\)로 나타낼 수 있으므로 첫 번째 수준에 대한 부분가치는 0의 값을 가지기 때문이다.
컨조인트 분석을 위해서는 자극(속성수준)을 이용하여 분석에 필요한 순위자료를 응답자로부터 얻어야 한다. 순위를 이용한 자료는 서열척도이며 응답자가 가장 좋아하는 것을 1, 그 다음 좋아하는 것을 2로 하여 제품의 선호도를 얻을 수 있다.
이 때 속성의 수나 속성수준이 많아지면 순위척도에 의한 응답이 현실적으로 불가능하므로, 자료수집의 어려움을 덜기 위한 방법으로서 다음의 몇가지 방법이 사용된다.
첫째, 전체자극들을 우선 ‘가장 선호’, ‘중간선호’, ‘싫어하는 선호’ 순으로 분류한 후 각 분류집단내에서 자극들의 선호순위 를 정하고 각 집단간의 경계부분에 있는 자극들을 비교하여 순위를 확정하는 방법이다. 이 방법은 응답자의 비교범위를 축소시켜 주기 때문에 전체자극의 수가 많을 경우에 널리 사용되고 있다.
둘째, 속성의 수가 많은 경우 전체 속성을 모두 고려하는 총체적 접근법(full profile approach)이 아닌 두 속성씩 비교하는 트레이드오프 제시법(trade-off method)을 사용하기도 한다. 이 방법은 응답자가 한번에 처리하여야 하는 정보의 양을 줄여주는 장점이 있으나, 전체적으로 평가하여야 하는 자극의 수는 많아 지며, 하나의 자극을 한번에 평가하지 않으므로 평가간에 모순 (intransitivity)이 생길 수 있다.
셋째, 전체 대상간에 서열척도를 구하는 대신 각 자극들의 선호도를 등급법이나 리커트 형태의 척도에 의해 구한 후 이를 다시 서열척도로 변환하는 방법이다. 이 방법도 자주 사용되나 자극들간에 같은 응답이 많이 나오기 때문에 자극의 수가 많아지면 사용하기 불편하다.
넷째, 직접 비교대상이 되는 자극의 수를 줄이는 방법이다. 이 방법은 각 속성단계들간의 상호작용을 최소화하면서 추정에 필요 한 자극의 수를 줄이는 방법으로 부분요인설계(fractional factorial design)를 하는 방법이다.
다음 과제는 이러한 조합 또는 자극이 응답자에게 표현되는 방식을 선택하는 것이다. 자극을 응답자에게 제시하는 방법은 어구적 표현(verbal description), 산문적 표현(paragrape design), 회화적 표현(pictorial presentation)이 있다. 이외에도 PC를 이용하여 체계적으로 각 속성의 조합을 제시하는 방법도 있다. 사실 어떠한 방법이나 큰 차이가 없으나 연구의 취향이나 평가자의 수준에 따라 적절한 것을 사용하면 좋을것이다.
컨조인트 모수추정방법은 척도의 종류에 따라 크게 2가지 범주로 나눌 수 있으며, 각 방법들은 다음과 같은 특징들이 있다.
첫째, 자료가 넌메트릭(nonmetric)일 때 모수를 추정하는 대표적인 방법으로는 MONANOVA, PREMAP, LINMAP 등이 있으며, 이 방법들은 응답자가 선호도에 따라 각 프로필에 순위를 부여한 경우에 적합하다.
둘째, 자료가 메트릭(metric)일 때 모수를 추정하는 대표적인 방법은 회귀분석이 다. 이 방법은 응답자가 선호도에 따라 각 프로필에 점수를 부여한 경우에 적합하다.
그러나 시뮬레이션을 이용한 여러 연구 결과에 따르면, 종속변수의 종류에 관계없이 어느 방법을 사용하더라도 예측타당성에는 거의 차이가 없는 것으로 밝혀졌다. 예를 들어, 종속변수가 순위인 경우에 회귀분석을 이용하여 계수를 추정하더라도 넌메트릭 자료를 이용하여 추정한 것과 거의 같은 예측 타당성을 보인다는 것이다(Green and Srinivasan 1978). 넌메트릭 자료들은 회귀분석에 비해서 널리 보급이 되어 있지 못하고 또 회귀분석에 비하여 계산속도가 느리기 때문에 많은 연구들이회귀분석을 이용하고 있다
넌메트릭 자료들 가운데에서는 LINMAP이 다른 방법들에 비하여 여러 가지 바람직한 특성을 가지고 있다. 우선, LINMAP은 선형계획법(linear programming)을 사용하고 있기 때문에, 대수학(calculus)에 의존하고 있는 다른 방법들과는 달리, 계수를 추정할 때 전체최적(global optimum)을 달성할 수 있다. 또한 LINMAP을 사용하면 부분가치함수에 각종 제약조건(constraint)들을 부과하여 벡터모델이나 이상점모델로 바꿔 주므로써, 추정되는 계수의 수를 줄여서 예측타당성을 향상시킬 수 있는 반면, 다른 방법들을 사용하면 이같은 제약조건들을 부가하는 것이 불가능하다. 특히 MONANOVA는 부분가치함수모델에만 모수추정이 가능한 방법이며, 요인설계(factorial design)에 의해 수집한 자료를 변환시키는 과정을 통해 모수를 추정한다.
자료분석을 통한 결과 해석의 주요 내용은 전형적으로 다음과 같은 것을 포함한다.
첫째, 각 응답자에 대한 각 속성수준의 부분가치(효용) 추정,
둘째, 인구통계적(demographic) 변수를 이용한시장세분(segments) 분,
셋째, 속성의 상대적 중요도 결정,
넷째, 초이스 시뮬레이션(choice simullation),
다섯째, 소비자 만족도 향상을 위한 쇼핑모형을 찾아보는 것 등이다.
This DCE construction is described in the Theory Introduction to ExpertChoice vignette. It is intended to show more succinctly (than the silver example above) how to design such an experiment
library(ExpertChoice)
library(DoE.base)
## 필요한 패키지를 로딩중입니다: conf.design
## Registered S3 method overwritten by 'DoE.base':
## method from
## factorize.factor conf.design
##
## 다음의 패키지를 부착합니다: 'DoE.base'
## The following objects are masked from 'package:stats':
##
## aov, lm
## The following object is masked from 'package:graphics':
##
## plot.design
## The following object is masked from 'package:base':
##
## lengths
#Step 0
# Described in Theory
kim <- list(
length = c("1", "2", "3"),
wage = c("1", "2", "3"),
work = c("1", "2", "3")
)
kim
## $length
## [1] "1" "2" "3"
##
## $wage
## [1] "1" "2" "3"
##
## $work
## [1] "1" "2" "3"
# Step 1
kim_examp <- full_factorial(kim)
kim_examp
# Step 2
akim_examp <- augment_levels(kim_examp)
## Applying B mat
#> Applying B mat
#write.csv(ff_examp, "example.csv")
# Step 3
nlevels <- unlist(purrr::map(kim_examp, function(x){length(levels(x))}))
#oa_feasible(36, nlevels, strength = 3)
fractional_factorial_kim <- oa.design(nlevels = nlevels, columns = "min34")
# The fractional_factorial design is generated using the DoE.MIParray package.
# The following is the command to run this generation.
# The result is saved in the package.
# Step 4
# Confirming that this is an efficient design.
colnames(fractional_factorial_kim) <- colnames(kim_examp)
fractional_factorial_kim <- search_design(kim_examp, fractional_factorial_kim)
# Step 5.
# This table is reported as Table
# Confirm that this design supports all interactions.
row1_main_effects <- fractional_factorial_efficiency(~ length + wage + work, fractional_factorial_kim)
## Your fractional factorial design has an A-efficiency of100%
## Your fractional factorial design has a D-efficiency of100%
#> Your fractional factorial design has an A-efficiency of100%
#> Your fractional factorial design has a D-efficiency of100%
# Step 6.
# Two different card options
# Option 1
dce_modulo_examp1 <- modulo_method(
fractional_factorial_kim,
list(c(1, 0, 1), c(0,1,0))
)
# Option 2.
dce_modulo_examp2 <- modulo_method(
fractional_factorial_kim,
list(c(1, 0, 1), c(1,2,1), c(0,2,0))
)
# Step 7
# This experiment uses categorical data (not ordinal) hence there can be no pareto dominate solution.
# Each category is merely a choice.
# Step 8.
# Compare the efficiencies.
dce_efficency_menu_example1 <- dce_efficiency(akim_examp, dce_modulo_examp1)
##
## q is1
## L is3
## Case 3
## The implied x is1and y is0
## s is3
##
## q is2
## L is3
## Case 3
## The implied x is1and y is0
## s is3
##
## q is3
## L is3
## Case 3
## The implied x is1and y is0
## s is3
##
##
## The D-efficiency of this discrete choice experiment is66.667%
dce_efficency_menu_example2 <- dce_efficiency(akim_examp, dce_modulo_examp2)
##
## q is1
## L is3
## Case 3
## The implied x is1and y is1
## s is5
##
## q is2
## L is3
## Case 3
## The implied x is1and y is1
## s is5
##
## q is3
## L is3
## Case 3
## The implied x is1and y is1
## s is5
##
##
## The D-efficiency of this discrete choice experiment is80%
# Step 9
# Construct the question table
kim_question_table <- construct_question_frame(akim_examp, dce_modulo_examp1)
kim_question_table2 <- construct_question_frame(akim_examp, dce_modulo_examp2)
# Finally augment the question table. See Table 1 in the Theoretical Vignette.
levels(kim_question_table$length) <- c("1년 연장", "2년 연장", "3년연장")
levels(kim_question_table$wage) <- c("60%급여", "60%급여", "50%급여")
levels(kim_question_table$work) <- c("근로시간단축", "편한 직무로 보직전환", "계약제로 재고용")
kim_question_table
write.csv(kim_question_table, "kim_question_table.csv", row.names=TRUE, fileEncoding = 'cp949')
# install.packages("conjoint")
library(conjoint)
experiment<-expand.grid(length=c("1년 연장", "2년 연장", "3년연장"),
wage=c("80%급여", "70%급여", "60%급여"),
work=c("근로시간단축", "편한 직무로 보직전환", "계약제로 재고용"))
design=caFactorialDesign(data=experiment,type="full")
print(design)
## length wage work
## 1 1년 연장 80%급여 근로시간단축
## 2 2년 연장 80%급여 근로시간단축
## 3 3년연장 80%급여 근로시간단축
## 4 1년 연장 70%급여 근로시간단축
## 5 2년 연장 70%급여 근로시간단축
## 6 3년연장 70%급여 근로시간단축
## 7 1년 연장 60%급여 근로시간단축
## 8 2년 연장 60%급여 근로시간단축
## 9 3년연장 60%급여 근로시간단축
## 10 1년 연장 80%급여 편한 직무로 보직전환
## 11 2년 연장 80%급여 편한 직무로 보직전환
## 12 3년연장 80%급여 편한 직무로 보직전환
## 13 1년 연장 70%급여 편한 직무로 보직전환
## 14 2년 연장 70%급여 편한 직무로 보직전환
## 15 3년연장 70%급여 편한 직무로 보직전환
## 16 1년 연장 60%급여 편한 직무로 보직전환
## 17 2년 연장 60%급여 편한 직무로 보직전환
## 18 3년연장 60%급여 편한 직무로 보직전환
## 19 1년 연장 80%급여 계약제로 재고용
## 20 2년 연장 80%급여 계약제로 재고용
## 21 3년연장 80%급여 계약제로 재고용
## 22 1년 연장 70%급여 계약제로 재고용
## 23 2년 연장 70%급여 계약제로 재고용
## 24 3년연장 70%급여 계약제로 재고용
## 25 1년 연장 60%급여 계약제로 재고용
## 26 2년 연장 60%급여 계약제로 재고용
## 27 3년연장 60%급여 계약제로 재고용
print(cor(caEncodedDesign(design)))
## length wage work
## length 1 0 0
## wage 0 1 0
## work 0 0 1
library(conjoint)
library(grid)
library(gridExtra)
# declare features and their levels
experiment<-expand.grid(length= c("A",
"B",
"C"),
wage = c("A",
"B",
"C"),
work = c("A",
"B",
"C"))
# create orthogonal design
design1=caFactorialDesign(data=experiment,
type = "orthogonal")
print(design1)
## length wage work
## 2 B A A
## 6 C B A
## 7 A C A
## 12 C A B
## 13 A B B
## 17 B C B
## 19 A A C
## 27 C C C
print(cor(caEncodedDesign(design1)))
## length wage work
## length 1 0 0
## wage 0 1 0
## work 0 0 1
# produce alternatives via shifting
design1 <- data.frame(lapply(design1,
function(x)
as.numeric(x))) # make values numeric first
design2 <- design1
design2[design1 == 1] <- 2
design2[design1 == 2] <- 3
design2[design1 == 3] <- 1
design3 <- design2
design3[design2 == 1] <- 2
design3[design2 == 2] <- 3
design3[design2 == 3] <- 1
# combine the designs to get choice sets
fulldesign <- cbind(design1, design2, design3)
fulldesign_long <- data.frame() # create design in long format
for (i in 1: nrow(design1)){
tmp <- data.frame(choice = i,
stimulus = 1:3,
rbind(design1[i, ], design2[i,], design3[i ,] ) # repeat 0 for no choice and 5 times for 5 variables
)
fulldesign_long <- rbind(fulldesign_long, tmp)
}
# define factors for creating choice cards
## Design 1 (first option in choice sets)
design1$length <- factor(design1$length,
levels = c(1,2,3),
labels = c("1년",
"2년",
"3년"))
design1$wage <- factor(design1$wage,
levels = c(1,2,3),
labels = c("70% 급여",
"60% 급여",
"60% 급여"))
design1$work <- factor(design1$work,
levels = c(1,2,3),
labels = c("근로시간 단축",
"편한 보직으로 전환",
"계약직으로 재고용"))
## Design 2 (second option in choice sets)
design2$length <- factor(design2$length,
levels = c(1,2,3),
labels = c("1년",
"2년",
"3년"))
design2$wage <- factor(design2$wage,
levels = c(1,2,3),
labels = c("70% 급여",
"60% 급여",
"60% 급여"))
design2$work <- factor(design2$work,
levels = c(1,2,3),
labels = c("근로시간 단축",
"편한 보직으로 전환",
"계약직으로 재고용"))
## Design 3 (third option in choice sets)
design3$length <- factor(design3$length,
levels = c(1,2,3),
labels = c("1년",
"2년",
"3년"))
design3$wage <- factor(design3$wage,
levels = c(1,2,3),
labels = c("70% 급여",
"60% 급여",
"60% 급여"))
design3$work <- factor(design3$work,
levels = c(1,2,3),
labels = c("근로시간 단축",
"편한 보직으로 전환",
"계약직으로 재고용"))
# example for first card
attr <- c("연장기간",
"급여수준",
"근로방식")
choice_tab <- data.frame(cbind(attr,
t(design1[4, ]),
t(design2[4, ]),
t(design3[4, ]))
)
colnames(choice_tab) <- c(" ",
"A 유형",
"B 유형",
"C 유형")
# grid.table(choice_tab, theme = mytheme)
g <- tableGrob(choice_tab,
cols = colnames(choice_tab),
rows = NULL,
theme = ttheme_default(base_size = 12,
core = list(fg_params = list(hjust=c(rep(0,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)),
x=c(rep(0.1,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)),
fontsize=9),
bg_params = list(fill = c("grey80", "grey80", "grey80", "grey80", # first column
"grey95", "grey90", "grey95", "grey90", # second column
"grey95", "grey90", "grey95", "grey90") ) # third column
), # end core params
colhead = list(bg_params=list(fill = "lightskyblue1"),
fg_params=list(fontsize = 10))
) # end theme
)# end tableGrob
g$widths <- unit(c(0.10, 0.15, 0.15, 0.15), "npc")
grid.draw(g)
# Test image
png(paste0("Kim_test.png"), width=10, height=3,units="cm", res=400)
grid.draw(g)
# dev.off()
#### Step 4: Loop for creating all choice set cards (this will write image files) ####
# setwd("E:/Conjoint4")
for (i in 1:nrow(design1)) {
print(paste("Kim' Conjoint: ", i))
attr <- c("연장기간",
"급여수준",
"근로방식")
choice_tab <- data.frame(cbind(attr,
t(design1[i, ]),
t(design2[i, ]),
t(design3[i, ]))
)
colnames(choice_tab) <- c(" ",
"A 유형",
"B 유형",
"C 유형")
png(paste0("Kim's_Conjoint", i, ".png"), width=10, height=3,units="cm", res=400)
g <- tableGrob(choice_tab,
cols = colnames(choice_tab),
rows = NULL,
theme = ttheme_default(base_size = 12,
core = list(fg_params = list(hjust=c(rep(0,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)),
x=c(rep(0.1,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)),
fontsize=9),
bg_params = list(fill = c("grey80", "grey80", "grey80", "grey80", # first column
"grey95", "grey90", "grey95", "grey90", # second column
"grey95", "grey90", "grey95", "grey90") ) # third column
), # end core params
colhead = list(bg_params=list(fill = "lightskyblue1"),
fg_params=list(fontsize = 10))
) # end theme
)# end tableGrob
g$widths <- unit(c(0.25, 0.25, 0.25, 0.25), "npc")
grid.draw(g)
dev.off()
}
## [1] "Kim' Conjoint: 1"
## [1] "Kim' Conjoint: 2"
## [1] "Kim' Conjoint: 3"
## [1] "Kim' Conjoint: 4"
## [1] "Kim' Conjoint: 5"
## [1] "Kim' Conjoint: 6"
## [1] "Kim' Conjoint: 7"
## [1] "Kim' Conjoint: 8"
library(conjoint)
library(skpr)
library(AlgDesign)
library(mlogit)
library(texreg)
library(dotwhisker)
library(ggplot2)
library(jtools)
library(grid)
library(gridExtra)
#### Step 1: Create orthogonal design and full choice set matrix #####
experiment<-expand.grid(length= c("A",
"B",
"C"),
wage = c("A",
"B",
"C"),
work = c("A",
"B",
"C"))
# create orthogonal design
design1=caFactorialDesign(data=experiment,
type = "orthogonal")
print(design1)
## length wage work
## 2 B A A
## 6 C B A
## 7 A C A
## 12 C A B
## 13 A B B
## 17 B C B
## 19 A A C
## 27 C C C
print(cor(caEncodedDesign(design1)))
## length wage work
## length 1 0 0
## wage 0 1 0
## work 0 0 1
# produce alternatives via shifting
design1 <- data.frame(lapply(design1,
function(x)
as.numeric(x)))
design2 <- design1
design2[design1 == 1] <- 2
design2[design1 == 2] <- 3
design2[design1 == 3] <- 1
design3 <- design2
design3[design2 == 1] <- 2
design3[design2 == 2] <- 3
design3[design2 == 3] <- 1
# combine the designs to get choice sets
fulldesign <- cbind(design1, design2, design3)
fulldesign_long <- data.frame()
for (i in 1: nrow(design1)){
tmp <- data.frame(choice = i,
stimulus = 1:4,
rbind(design1[i, ], design2[i,], design3[i ,], rep(0, 5) ) # repeat 0 for no choice and 5 times for 5 variables
)
fulldesign_long <- rbind(fulldesign_long, tmp)
}
library(readxl)
df=read_excel("df.xlsx")
psych::alpha(df[c("effect1",
"effect2",
"effect3",
"effect4",
"effect5",
"need1") ],
check.keys=TRUE)
##
## Reliability analysis
## Call: psych::alpha(x = df[c("effect1", "effect2", "effect3", "effect4",
## "effect5", "need1")], check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.96 0.96 0.96 0.81 26 0.003 2.3 0.99 0.8
##
## 95% confidence boundaries
## lower alpha upper
## Feldt 0.96 0.96 0.97
## Duhachek 0.96 0.96 0.97
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## effect1 0.95 0.95 0.95 0.81 21 0.0036 0.0014 0.80
## effect2 0.95 0.95 0.95 0.81 21 0.0036 0.0014 0.81
## effect3 0.95 0.95 0.95 0.80 20 0.0038 0.0011 0.80
## effect4 0.95 0.95 0.95 0.80 20 0.0038 0.0012 0.79
## effect5 0.96 0.96 0.96 0.82 23 0.0033 0.0015 0.81
## need1 0.96 0.96 0.95 0.82 23 0.0034 0.0018 0.81
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## effect1 400 0.92 0.92 0.91 0.89 2.3 1.1
## effect2 400 0.92 0.92 0.90 0.88 2.3 1.0
## effect3 400 0.93 0.93 0.92 0.90 2.3 1.1
## effect4 400 0.93 0.93 0.92 0.90 2.3 1.1
## effect5 400 0.89 0.89 0.86 0.85 2.2 1.1
## need1 400 0.90 0.90 0.87 0.86 2.3 1.1
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## effect1 0.29 0.27 0.30 0.12 0.02 0
## effect2 0.28 0.32 0.27 0.12 0.01 0
## effect3 0.28 0.28 0.30 0.11 0.02 0
## effect4 0.30 0.26 0.30 0.12 0.02 0
## effect5 0.31 0.28 0.28 0.11 0.02 0
## need1 0.30 0.24 0.29 0.14 0.03 0
# generate mean of all items
df$effect <- apply(df[c("effect1",
"effect2",
"effect3",
"effect4",
"effect5",
"need1") ], 1, mean, na.rm=TRUE)
hist(df$effect)
psych::alpha(df[c("need2",
"need3",
"need4",
"need5") ],
check.keys=TRUE)
##
## Reliability analysis
## Call: psych::alpha(x = df[c("need2", "need3", "need4", "need5")], check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.95 0.95 0.94 0.84 20 0.0039 2.6 1.2 0.83
##
## 95% confidence boundaries
## lower alpha upper
## Feldt 0.94 0.95 0.96
## Duhachek 0.95 0.95 0.96
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## need2 0.93 0.93 0.90 0.82 14 0.0059 0.00069 0.82
## need3 0.93 0.93 0.90 0.82 14 0.0060 0.00090 0.81
## need4 0.94 0.94 0.92 0.84 16 0.0053 0.00186 0.82
## need5 0.95 0.95 0.93 0.86 19 0.0044 0.00049 0.85
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## need2 400 0.95 0.95 0.93 0.91 2.6 1.3
## need3 400 0.95 0.95 0.93 0.91 2.6 1.3
## need4 400 0.93 0.93 0.90 0.88 2.6 1.3
## need5 400 0.91 0.91 0.87 0.85 2.6 1.2
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## need2 0.29 0.18 0.25 0.23 0.05 0
## need3 0.30 0.17 0.25 0.22 0.06 0
## need4 0.30 0.17 0.26 0.20 0.07 0
## need5 0.28 0.19 0.26 0.22 0.05 0
df$need <- apply(df[c("need2",
"need3",
"need4",
"need5") ], 1, mean, na.rm=TRUE)
hist(df$need)
library(NbClust)
# criteria <- NbClust(data = df[ , c("effect", "need") ], diss = NULL, distance = "euclidean", min.nc = 2, max.nc = 10, method = "kmeans", index = "all", alphaBeale = 0.1)
# K-Means Cluster Analysis
kclust <- kmeans( df[ , c("effect", "need") ] ,
3,
nstart=10 ) # 3 cluster solution
df$kgroup <- kclust$cluster
# get cluster means (shown in Annex A7)
aggregate(df[, c("effect", "need")],
by=list(df$kgroup),
FUN=mean)
table(df$kgroup)
##
## 1 2 3
## 75 163 162
table(df$kgroup, df$effect)
##
## 1 1.16666666666667 1.33333333333333 1.5 1.66666666666667 1.83333333333333
## 1 0 0 0 0 0 0
## 2 8 0 1 2 4 2
## 3 71 10 14 10 6 15
##
## 2 2.16666666666667 2.33333333333333 2.5 2.66666666666667 2.83333333333333
## 1 1 0 0 0 3 4
## 2 14 9 15 21 13 17
## 3 24 5 4 2 1 0
##
## 3 3.16666666666667 3.33333333333333 3.5 3.66666666666667 3.83333333333333
## 1 7 3 7 9 6 6
## 2 46 6 5 0 0 0
## 3 0 0 0 0 0 0
##
## 4 4.16666666666667 4.33333333333333 4.66666666666667 4.83333333333333 5
## 1 19 1 2 1 1 5
## 2 0 0 0 0 0 0
## 3 0 0 0 0 0 0
table(df$kgroup, df$need)
##
## 1 1.25 1.5 1.75 2 2.25 2.5 2.75 3 3.25 3.5 3.75 4 4.25 4.5 4.75 5
## 1 0 0 0 0 0 0 0 0 0 2 5 14 30 5 2 8 9
## 2 0 0 0 0 8 8 15 16 58 22 9 11 13 1 1 0 1
## 3 88 13 14 16 24 6 1 0 0 0 0 0 0 0 0 0 0
# kgroup3: effect low, need low
# kgroup2: effect mid, need mid
# kgroup1: effect high, need high
df$set1 <- 4
df$set1[df$v21 == "A유형"] <- 1
df$set1[df$v21 == "B유형"] <- 2
df$set1[df$v21 == "C유형"] <- 3
df$set2 <- 4
df$set2[df$v22 == "A유형"] <- 1
df$set2[df$v22 == "B유형"] <- 2
df$set2[df$v22 == "C유형"] <- 3
df$set3 <- 4
df$set3[df$v23 == "A유형"] <- 1
df$set3[df$v23 == "B유형"] <- 2
df$set3[df$v23 == "C유형"] <- 3
df$set4 <- 4
df$set4[df$v24 == "A유형"] <- 1
df$set4[df$v24 == "B유형"] <- 2
df$set4[df$v24 == "C유형"] <- 3
df$set5 <- 4
df$set5[df$v25 == "A유형"] <- 1
df$set5[df$v25 == "B유형"] <- 2
df$set5[df$v25 == "C유형"] <- 3
df$set6 <- 4
df$set6[df$v26 == "A유형"] <- 1
df$set6[df$v26 == "B유형"] <- 2
df$set6[df$v26 == "C유형"] <- 3
df$set7 <- 4
df$set7[df$v27 == "A유형"] <- 1
df$set7[df$v27 == "B유형"] <- 2
df$set7[df$v27 == "C유형"] <- 3
df$set8 <- 4
df$set8[df$v28 == "A유형"] <- 1
df$set8[df$v28 == "B유형"] <- 2
df$set8[df$v28 == "C유형"] <- 3
#### Transform response data into required format ####
df$respondentID <- 1:nrow(df)
min_frame <- df
r <- 8 # set number of choice sets
k <- 4 # set number of stimuli per choice set
minl_frame <- data.frame() # define empty dataframe
for (i in 1:nrow(min_frame)){ # build choice data frames for every respondent and append them.
tmp <- data.frame(respondentID = min_frame$respondentID[i],
choice = rep(1:r, each=k),
stimulus = rep(1:k, r)
)
tmp_choice <- rep(0, r*k)
# get indices: which
get_indices <- c(min_frame$set1[i],
min_frame$set2[i],
min_frame$set3[i],
min_frame$set4[i],
min_frame$set5[i],
min_frame$set6[i],
min_frame$set7[i],
min_frame$set8[i]
) + rep(k, r)*(0:(r-1)) # take into account that indices add up in steps of 4
tmp_choice[get_indices] <- 1
tmp <- data.frame(cbind(tmp, tmp_choice))
minl_frame <- rbind(minl_frame, tmp)
}
#### generate dummy variables ####
# length연장기간
fulldesign_long$length1 <- 0
fulldesign_long$length1[fulldesign_long$length == 1] <- 1
fulldesign_long$length2 <- 0
fulldesign_long$length2[fulldesign_long$length == 2] <- 1
fulldesign_long$length3 <- 0
fulldesign_long$length3[fulldesign_long$length == 3] <- 1
# wage 급여수준
fulldesign_long$wage1 <- 0
fulldesign_long$wage1[fulldesign_long$wage == 1] <- 1
fulldesign_long$wage2 <- 0
fulldesign_long$wage2[fulldesign_long$wage == 2] <- 1
fulldesign_long$wage3 <- 0
fulldesign_long$wage3[fulldesign_long$wage == 3] <- 1
# work 근로방법
fulldesign_long$work1 <- 0
fulldesign_long$work1[fulldesign_long$work == 1] <- 1
fulldesign_long$work2 <- 0
fulldesign_long$work2[fulldesign_long$work == 2] <- 1
fulldesign_long$work3 <- 0
fulldesign_long$work3[fulldesign_long$work == 3] <- 1
# none
fulldesign_long$none <- 0
fulldesign_long$none[fulldesign_long$stimulus == 4] <- 1
#### generate effect-coded variables ####
# length연장기간
fulldesign_long$lengthE2 <- 0
fulldesign_long$lengthE2[fulldesign_long$length == 1] <- -1
fulldesign_long$lengthE2[fulldesign_long$length == 2] <- 1
fulldesign_long$lengthE3 <- 0
fulldesign_long$lengthE3[fulldesign_long$length == 1] <- -1
fulldesign_long$lengthE3[fulldesign_long$length == 3] <- 1
# wage 급여수준
fulldesign_long$wageE2 <- 0
fulldesign_long$wageE2[fulldesign_long$wage == 1] <- -1
fulldesign_long$wageE2[fulldesign_long$wage == 2] <- 1
fulldesign_long$wageE3 <- 0
fulldesign_long$wageE3[fulldesign_long$wage == 1] <- -1
fulldesign_long$wageE3[fulldesign_long$wage == 3] <- 1
# work 근로방법
fulldesign_long$workE2 <- 0
fulldesign_long$workE2[fulldesign_long$work == 1] <- -1
fulldesign_long$workE2[fulldesign_long$work == 2] <- 1
fulldesign_long$workE3 <- 0
fulldesign_long$workE3[fulldesign_long$work == 1] <- -1
fulldesign_long$workE3[fulldesign_long$work == 3] <- 1
# none
fulldesign_long$noneE <- -1
fulldesign_long$noneE[fulldesign_long$stimulus == 4] <- 1
# match the reshaped data.frame with the choice designs
merged_data <- merge(minl_frame, fulldesign_long, by = c("choice", "stimulus"))
merged_data <- merged_data[order(merged_data$respondentID), ]
# prepare individual level variables to be merged with the choice data frame
tmp <- df[c("respondentID",
"peak",
"gender",
"affiliation",
"age",
"income",
"experiment",
"kgroup")] # generaged clustervariable
merged_data <- merge(merged_data, tmp, by = "respondentID") # merge individual-level variables with the choice data frame
# convert data for mlogit
cbc <- mlogit.data(merged_data, choice="tmp_choice", shape="long", alt.var="stimulus", id.var = "respondentID")
#### partworth model with dummy coding ####
ml1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc)
summary(ml1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc, method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29031 0.25500 0.24938 0.20531
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 4.64E-08
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.376320 0.058267 6.4586 1.057e-10 ***
## length3 0.663826 0.055910 11.8731 < 2.2e-16 ***
## wage2 -0.593390 0.046870 -12.6603 < 2.2e-16 ***
## wage3 -1.236786 0.059698 -20.7174 < 2.2e-16 ***
## work2 -0.188861 0.051181 -3.6901 0.0002242 ***
## work3 -0.457528 0.053572 -8.5404 < 2.2e-16 ***
## none -0.548724 0.066735 -8.2224 2.220e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -3991
# wordreg( ml1, file = "model 1.doc")
round(exp(coef(ml1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.457 1.942 0.552 0.290 0.828 0.633 0.578
tmp_ml <- ml1
p_pred_a <- exp( tmp_ml$coefficients[4] + # 60% 급여
tmp_ml$coefficients[6] ) / # 계약직
( exp( tmp_ml$coefficients[4] + # 60%급여
tmp_ml$coefficients[6] ) + # 계약직
exp( tmp_ml$coefficients[4]*0 + # 80%급여
tmp_ml$coefficients[6]*0 ) # 시간단축
)
p_pred_a # get predicted probability of choosing option A versus option B
## wage3
## 0.1552093
library(MASS)
##
## 다음의 패키지를 부착합니다: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
## The following object is masked from 'package:dfidx':
##
## select
m <- ml1
est_betas <- m$coefficients
sim_betas <- mvrnorm(1000, m$coefficients, vcov(m))
sim_ci <- data.frame()
sim_preds <- apply(sim_betas, 1, function(x) {
m$coefficients <- x
exp( m$coefficients[4] + # explainable AI
m$coefficients[6] ) / # costs2
( exp( m$coefficients[4] + # explainable AI
m$coefficients[6] ) + # costs 2
exp( m$coefficients[4]*0 + # NO explainable AI
m$coefficients[6]*0 ) # NO costs
)
})
sim_ci_a <- rbind(sim_ci, quantile(sim_preds, c(.025, .975)) )
colnames(sim_ci_a) <- c("lo", "hi")
sim_ci_a
# prepare results for plotting
modelframe <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( ml1$coefficients[1], ml1$coefficients[2],
ml1$coefficients[3], ml1$coefficients[4],
ml1$coefficients[5], ml1$coefficients[6],
ml1$coefficients[7]),
std.error = c( coef(summary(ml1))[1, "Std. Error"], coef(summary(ml1))[2, "Std. Error"],
coef(summary(ml1))[3, "Std. Error"], coef(summary(ml1))[4, "Std. Error"],
coef(summary(ml1))[5, "Std. Error"], coef(summary(ml1))[6, "Std. Error"],
coef(summary(ml1))[7, "Std. Error"]
)
)
# https://cran.r-project.org/web/packages/dotwhisker/vignettes/dotwhisker-vignette.html
dwplot(modelframe)
# define additional labels in plot (bracketing the feature levels)
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p1 <- {dwplot(modelframe,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1)) +
theme_bw() + scale_colour_grey() + theme(legend.position = "none") + xlab("Estimated partworth utilities") + ylab("") } %>% add_brackets(three_brackets)
p1
png("Fig_1.png", width=18, height=18,units="cm", res=400)
p1
pdf(paste0("Fig_1.pdf"), width=7, height=7)
p1
dev.off()
## png
## 2
#### Model with effects coding (shown in Annex A5 and A6) ####
ml1e <- mlogit(tmp_choice ~ lengthE2 + lengthE3 +
wageE2 + wageE3 +
workE2 + workE3 +
none | 0, cbc)
summary(ml1e)
##
## Call:
## mlogit(formula = tmp_choice ~ lengthE2 + lengthE3 + wageE2 +
## wageE3 + workE2 + workE3 + none | 0, data = cbc, method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29031 0.25500 0.24938 0.20531
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 4.64E-08
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## lengthE2 0.029605 0.030611 0.9671 0.3335
## lengthE3 0.317111 0.029110 10.8937 < 2.2e-16 ***
## wageE2 0.016669 0.031886 0.5228 0.6011
## wageE3 -0.626727 0.038372 -16.3330 < 2.2e-16 ***
## workE2 0.026602 0.031311 0.8496 0.3955
## workE3 -0.242065 0.032617 -7.4215 1.157e-13 ***
## none -0.069918 0.045667 -1.5310 0.1258
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -3991
# wordreg( ml1e, file = "model 1E.doc")
covMatrix <- vcov(ml1e)
sqrt(diag(covMatrix))
## lengthE2 lengthE3 wageE2 wageE3 workE2 workE3 none
## 0.03061134 0.02910965 0.03188605 0.03837193 0.03131058 0.03261671 0.04566664
modelframe <- data.frame(term = c(
'1yr',
'2yr',
'3yr',
'80%',
'70%',
'60%',
'reduction',
'transfer',
'part-time',
'No-choice option'),
estimate = c( -(ml1e$coefficients[1] + ml1e$coefficients[2]), ml1e$coefficients[1], ml1e$coefficients[2],
-(ml1e$coefficients[3] + ml1e$coefficients[4]), ml1e$coefficients[3], ml1e$coefficients[4],
-(ml1e$coefficients[5] + ml1e$coefficients[6]), ml1e$coefficients[5], ml1e$coefficients[6],
ml1e$coefficients[7] ),
std.error = c( sqrt(sum(covMatrix[1:2, 1:2])), coef(summary(ml1e))[1, "Std. Error"], coef(summary(ml1e))[2, "Std. Error"],
sqrt(sum(covMatrix[3:4, 3:4])), coef(summary(ml1e))[3, "Std. Error"], coef(summary(ml1e))[4, "Std. Error"],
sqrt(sum(covMatrix[5:6, 5:6])), coef(summary(ml1e))[5, "Std. Error"], coef(summary(ml1e))[6, "Std. Error"],
coef(summary(ml1e))[7, "Std. Error"]
)
)
modelframe # get estimates (for Annex A5)
dwplot(modelframe)
# define additional labels in plot (bracketing the feature levels)
three_brackets <- list(c("length",
"1yr",
"3yr"),
c("wage",
'80%',
'60%'),
c("work",
'reduction',
'part-time')
)
p1e <- {dwplot(modelframe,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1)) +
theme_bw() + scale_colour_grey() + theme(legend.position = "none") + xlab("Estimated partworth utilities") + ylab("") } %>% add_brackets(three_brackets)
p1e
png("Fig_1e.png", width=18, height=18,units="cm", res=400)
p1e
pdf(paste0("Fig_1e.pdf"), width=7, height=7)
p1e
dev.off()
## png
## 2
ml1g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$kgroup == 1, ])
summary(ml1g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$kgroup == 1, ],
## method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.31333 0.31167 0.31500 0.06000
##
## nr method
## 5 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.000115
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.437690 0.129445 3.3813 0.0007215 ***
## length3 0.952692 0.118585 8.0338 8.882e-16 ***
## wage2 -0.406232 0.101233 -4.0129 5.999e-05 ***
## wage3 -0.921885 0.120271 -7.6651 1.776e-14 ***
## work2 -0.028530 0.112559 -0.2535 0.7999066
## work3 -0.067461 0.113677 -0.5934 0.5528847
## none -1.511329 0.207903 -7.2694 3.610e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -676.18
ml1g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$kgroup == 2, ])
summary(ml1g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$kgroup == 2, ],
## method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.33052 0.28374 0.28298 0.10276
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 4.67E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.444100 0.085249 5.2094 1.894e-07 ***
## length3 0.627066 0.084044 7.4612 8.571e-14 ***
## wage2 -0.584567 0.069957 -8.3561 < 2.2e-16 ***
## wage3 -1.169597 0.086657 -13.4969 < 2.2e-16 ***
## work2 -0.244641 0.075023 -3.2609 0.001111 **
## work3 -0.601871 0.080559 -7.4711 7.949e-14 ***
## none -1.397780 0.117535 -11.8924 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1525.2
ml1g3 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$kgroup == 3, ])
summary(ml1g3)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$kgroup == 3, ],
## method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.23920 0.19985 0.18519 0.37577
##
## nr method
## 5 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.00105
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.235455 0.102764 2.2912 0.02195 *
## length3 0.501524 0.098107 5.1120 3.187e-07 ***
## wage2 -0.745044 0.082524 -9.0282 < 2.2e-16 ***
## wage3 -1.634117 0.117942 -13.8553 < 2.2e-16 ***
## work2 -0.202816 0.090820 -2.2332 0.02554 *
## work3 -0.537365 0.095018 -5.6554 1.555e-08 ***
## none 0.057760 0.103571 0.5577 0.57706
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1563.1
# Prepare results for plotting group 1
tmp <- ml1g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.1294455 0.1185849 0.1012326 0.1202706 0.1125585 0.1136765 0.2079033
modelframe2a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe2a$term <- as.character(modelframe2a$term)
dwplot(modelframe2a)
# Prepare results for plotting group 2
tmp <- ml1g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.08524922 0.08404411 0.06995710 0.08665698 0.07502342 0.08055944 0.11753543
modelframe2b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe2b$term <- as.character(modelframe2b$term)
dwplot(modelframe2b)
# Prepare results for plotting group 3
tmp <- ml1g3
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.10276389 0.09810679 0.08252447 0.11794157 0.09082015 0.09501760 0.10357076
modelframe2c <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe2c$term <- as.character(modelframe2c$term)
dwplot(modelframe2c)
# combine results for all three groups into one data frame for plotting
modelframe_allg <- rbind(modelframe2a, modelframe2b, modelframe2c) # dataframe with outputs for all groups
modelframe_allg$model <- rep(c("Group 1", "Group 2", "Group 3"), each = 7)
modelframe_allg <- modelframe_allg[order(modelframe_allg$model, decreasing = T), ]
modelframe_allg$model <- factor (modelframe_allg$model,
levels = c("Group 1","Group 2","Group 3"),
labels = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p2 <- {dwplot( modelframe_allg ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow"),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Group", breaks=c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow") ) +
scale_shape_discrete(name = "Group", breaks = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow") ) +
guides(
shape = guide_legend("Group"),
colour = guide_legend("Group"),
size = guide_legend("Group")
) } %>%
add_brackets(three_brackets)
p2
png("Fig_2.png", width=18, height=18,units="cm", res=400)
p2
pdf(paste0("Fig_2.pdf"), width=7, height=7)
p2
dev.off()
## png
## 2
round(exp(coef(ml1g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.549 2.593 0.666 0.398 0.972 0.935 0.221
round(exp(coef(ml1g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.559 1.872 0.557 0.310 0.783 0.548 0.247
round(exp(coef(ml1g3)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.265 1.651 0.475 0.195 0.816 0.584 1.059
table(df$gender)
##
## 남성 여성
## 308 92
ml2g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$gender == "남성", ])
summary(ml2g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$gender == "남성",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.28693 0.25122 0.24269 0.21916
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 1.14E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.347450 0.066948 5.1898 2.105e-07 ***
## length3 0.662683 0.063861 10.3770 < 2.2e-16 ***
## wage2 -0.588388 0.053801 -10.9364 < 2.2e-16 ***
## wage3 -1.227969 0.068507 -17.9248 < 2.2e-16 ***
## work2 -0.179668 0.058811 -3.0550 0.002251 **
## work3 -0.439356 0.061377 -7.1583 8.167e-13 ***
## none -0.465283 0.075568 -6.1572 7.405e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -3090.6
ml2g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$gender == "여성", ])
summary(ml2g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$gender == "여성",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.30163 0.26766 0.27174 0.15897
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.00425
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.465919 0.118515 3.9313 8.448e-05 ***
## length3 0.667580 0.115778 5.7660 8.116e-09 ***
## wage2 -0.610465 0.095616 -6.3846 1.719e-10 ***
## wage3 -1.266244 0.121800 -10.3961 < 2.2e-16 ***
## work2 -0.217646 0.104077 -2.0912 0.03651 *
## work3 -0.515823 0.109930 -4.6923 2.702e-06 ***
## none -0.863150 0.144104 -5.9898 2.102e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -892.94
# Prepare results for plotting group 1
tmp <- ml2g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.06694830 0.06386074 0.05380072 0.06850672 0.05881146 0.06137699 0.07556754
modelframe3a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe3a$term <- as.character(modelframe3a$term)
dwplot(modelframe3a)
# Prepare results for plotting group 2
tmp <- ml2g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.11851490 0.11577823 0.09561584 0.12179964 0.10407750 0.10992980 0.14410440
modelframe3b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe3b$term <- as.character(modelframe3b$term)
dwplot(modelframe3b)
# combine results for all three groups into one data frame for plotting
modelframe_allg2 <- rbind(modelframe3a, modelframe3b) # dataframe with outputs for all groups
modelframe_allg2$model <- rep(c("male", "female"), each = 7)
modelframe_allg2 <- modelframe_allg2[order(modelframe_allg2$model, decreasing = T), ]
modelframe_allg2$model <- factor (modelframe_allg2$model,
levels = c("male", "female"),
labels = c("male", "female"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p3 <- {dwplot( modelframe_allg2 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"male", "female"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Gender", breaks=c("male", "female") ) +
scale_shape_discrete(name = "Gender", breaks = c("male", "female") ) +
guides(
shape = guide_legend("Gender"),
colour = guide_legend("Gender"),
size = guide_legend("Gender")
) } %>%
add_brackets(three_brackets)
p3
png("Fig_3.png", width=18, height=18,units="cm", res=400)
p3
pdf(paste0("Fig_3.pdf"), width=7, height=7)
p3
dev.off()
## png
## 2
round(exp(coef(ml2g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.415 1.940 0.555 0.293 0.836 0.644 0.628
round(exp(coef(ml2g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.593 1.950 0.543 0.282 0.804 0.597 0.422
table(df$affiliation)
##
## 공기업 사기업
## 200 200
ml4g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$affiliation == "사기업", ])
summary(ml4g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$affiliation == "사기업",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29812 0.27563 0.25562 0.17062
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 2.46E-08
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.479732 0.081593 5.8796 4.114e-09 ***
## length3 0.687576 0.079725 8.6244 < 2.2e-16 ***
## wage2 -0.689575 0.065888 -10.4658 < 2.2e-16 ***
## wage3 -1.320166 0.083529 -15.8050 < 2.2e-16 ***
## work2 -0.250940 0.072219 -3.4747 0.0005114 ***
## work3 -0.474703 0.074732 -6.3521 2.124e-10 ***
## none -0.796195 0.096790 -8.2260 2.220e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1937.7
ml4g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$affiliation == "공기업", ])
summary(ml4g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$affiliation == "공기업",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.28250 0.23438 0.24313 0.24000
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 3.83E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.268009 0.083624 3.2049 0.0013509 **
## length3 0.643319 0.078607 8.1839 2.220e-16 ***
## wage2 -0.492007 0.066977 -7.3459 2.043e-13 ***
## wage3 -1.148013 0.085567 -13.4166 < 2.2e-16 ***
## work2 -0.127072 0.072859 -1.7441 0.0811458 .
## work3 -0.442535 0.077163 -5.7351 9.748e-09 ***
## none -0.322347 0.093111 -3.4620 0.0005363 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -2035.2
# Prepare results for plotting group 1
tmp <- ml4g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.08159318 0.07972498 0.06588812 0.08352859 0.07221886 0.07473156 0.09678964
modelframe4a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe4a$term <- as.character(modelframe4a$term)
dwplot(modelframe4a)
# Prepare results for plotting group 2
tmp <- ml4g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.08362370 0.07860744 0.06697704 0.08556688 0.07285906 0.07716329 0.09311143
modelframe4b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe4b$term <- as.character(modelframe4b$term)
dwplot(modelframe4b)
# combine results for all three groups into one data frame for plotting
modelframe_allg4 <- rbind(modelframe4a, modelframe4b) # dataframe with outputs for all groups
modelframe_allg4$model <- rep(c("private", "public"), each = 7)
modelframe_allg4 <- modelframe_allg4[order(modelframe_allg4$model, decreasing = T), ]
modelframe_allg4$model <- factor (modelframe_allg4$model,
levels = c("private", "public"),
labels = c("private", "public"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p4 <- {dwplot( modelframe_allg4 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"private", "public"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Affiliation", breaks=c("private", "public") ) +
scale_shape_discrete(name = "Affiliation", breaks = c("private", "public") ) +
guides(
shape = guide_legend("Affiliation"),
colour = guide_legend("Affiliation"),
size = guide_legend("Affiliation")
) } %>%
add_brackets(three_brackets)
p4
png("Fig_4.png", width=18, height=18,units="cm", res=400)
p4
pdf(paste0("Fig_4.pdf"), width=7, height=7)
p4
dev.off()
## png
## 2
round(exp(coef(ml4g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.616 1.989 0.502 0.267 0.778 0.622 0.451
round(exp(coef(ml4g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.307 1.903 0.611 0.317 0.881 0.642 0.724
table(df$peak)
##
## 0 1
## 83 317
ml5g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$peak == 1, ])
summary(ml5g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$peak == 1, ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.28785 0.25237 0.24448 0.21530
##
## nr method
## 4 iterations, 0h:0m:1s
## g'(-H)^-1g = 9.51E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.340775 0.066309 5.1392 2.759e-07 ***
## length3 0.622835 0.063746 9.7705 < 2.2e-16 ***
## wage2 -0.677929 0.053153 -12.7544 < 2.2e-16 ***
## wage3 -1.415175 0.070613 -20.0413 < 2.2e-16 ***
## work2 -0.186414 0.058397 -3.1922 0.001412 **
## work3 -0.511131 0.061323 -8.3350 < 2.2e-16 ***
## none -0.583400 0.074417 -7.8396 4.441e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -3116
ml5g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$peak == 0, ])
summary(ml5g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$peak == 0, ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29970 0.26506 0.26807 0.16717
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.000139
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.49670 0.12378 4.0129 5.999e-05 ***
## length3 0.80459 0.11793 6.8225 8.944e-12 ***
## wage2 -0.28297 0.10101 -2.8013 0.005090 **
## wage3 -0.68164 0.11562 -5.8955 3.735e-09 ***
## work2 -0.18719 0.10770 -1.7380 0.082209 .
## work3 -0.29188 0.11168 -2.6136 0.008958 **
## none -0.44964 0.15383 -2.9229 0.003468 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -853.12
# Prepare results for plotting group 1
tmp <- ml5g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.06630884 0.06374637 0.05315261 0.07061296 0.05839707 0.06132309 0.07441721
modelframe5a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe5a$term <- as.character(modelframe5a$term)
dwplot(modelframe5a)
# Prepare results for plotting group 2
tmp <- ml5g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.1237765 0.1179318 0.1010146 0.1156203 0.1077017 0.1116774 0.1538349
modelframe5b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe5b$term <- as.character(modelframe5b$term)
dwplot(modelframe5b)
# combine results for all three groups into one data frame for plotting
modelframe_allg5 <- rbind(modelframe5a, modelframe5b) # dataframe with outputs for all groups
modelframe_allg5$model <- rep(c("Yes", "No"), each = 7)
modelframe_allg5 <- modelframe_allg5[order(modelframe_allg5$model, decreasing = T), ]
modelframe_allg5$model <- factor (modelframe_allg5$model,
levels = c("Yes", "No"),
labels = c("Yes", "No"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p5 <- {dwplot( modelframe_allg5 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"Yes", "No"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "peak", breaks=c("Yes", "No") ) +
scale_shape_discrete(name = "peak", breaks = c("Yes", "No") ) +
guides(
shape = guide_legend("peak"),
colour = guide_legend("peak"),
size = guide_legend("peak")
) } %>%
add_brackets(three_brackets)
p5
png("Fig_5.png", width=18, height=18,units="cm", res=400)
p5
pdf(paste0("Fig_5.pdf"), width=7, height=7)
p5
dev.off()
## png
## 2
round(exp(coef(ml5g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.406 1.864 0.508 0.243 0.830 0.600 0.558
round(exp(coef(ml5g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.643 2.236 0.754 0.506 0.829 0.747 0.638
table(df$age)
##
## 장년 중년 청년
## 151 118 131
ml6g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$age == "청년", ])
summary(ml6g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$age == "청년",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29580 0.28721 0.25477 0.16221
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 5.83E-08
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.468677 0.101733 4.6069 4.087e-06 ***
## length3 0.767030 0.097633 7.8562 3.997e-15 ***
## wage2 -0.548681 0.079298 -6.9192 4.542e-12 ***
## wage3 -1.364954 0.106776 -12.7833 < 2.2e-16 ***
## work2 -0.224265 0.088483 -2.5345 0.01126 *
## work3 -0.467895 0.092402 -5.0637 4.113e-07 ***
## none -0.778378 0.121528 -6.4050 1.504e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1259.6
ml6g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$age == "중년", ])
summary(ml6g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$age == "중년",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.30085 0.25106 0.27542 0.17267
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.00165
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.356012 0.103055 3.4546 0.0005511 ***
## length3 0.597730 0.100440 5.9511 2.663e-09 ***
## wage2 -0.504397 0.084680 -5.9565 2.577e-09 ***
## wage3 -1.088724 0.105252 -10.3440 < 2.2e-16 ***
## work2 -0.192344 0.089557 -2.1477 0.0317363 *
## work3 -0.644974 0.098756 -6.5310 6.534e-11 ***
## none -0.795075 0.124482 -6.3871 1.691e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1174.7
ml6g3 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$age == "장년", ])
summary(ml6g3)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$age == "장년",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.27732 0.23013 0.22434 0.26821
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 3.39E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.308557 0.098643 3.1280 0.001760 **
## length3 0.630213 0.093568 6.7354 1.635e-11 ***
## wage2 -0.722731 0.080645 -8.9619 < 2.2e-16 ***
## wage3 -1.255726 0.099362 -12.6379 < 2.2e-16 ***
## work2 -0.151152 0.088332 -1.7112 0.087047 .
## work3 -0.290955 0.088995 -3.2693 0.001078 **
## none -0.217729 0.106664 -2.0413 0.041224 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1523.3
# Prepare results for plotting group 1
tmp <- ml6g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.10173323 0.09763328 0.07929824 0.10677608 0.08848310 0.09240248 0.12152756
modelframe6a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe6a$term <- as.character(modelframe6a$term)
dwplot(modelframe6a)
# Prepare results for plotting group 2
tmp <- ml6g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.10305492 0.10043972 0.08467987 0.10525194 0.08955721 0.09875592 0.12448191
modelframe6b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe6b$term <- as.character(modelframe6b$term)
dwplot(modelframe6b)
# Prepare results for plotting group 3
tmp <- ml6g3
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.09864320 0.09356754 0.08064466 0.09936181 0.08833159 0.08899484 0.10666361
modelframe6c <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe6c$term <- as.character(modelframe6c$term)
dwplot(modelframe6c)
# combine results for all three groups into one data frame for plotting
modelframe_allg6 <- rbind(modelframe6a, modelframe6b, modelframe6c) # dataframe with outputs for all groups
modelframe_allg6$model <- rep(c("2030s","40s","50s"), each = 7)
modelframe_allg6 <- modelframe_allg6[order(modelframe_allg6$model, decreasing = T), ]
modelframe_allg6$model <- factor (modelframe_allg6$model,
levels = c("2030s","40s","50s"),
labels = c("2030s","40s","50s"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p6 <- {dwplot( modelframe_allg6 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"2030s","40s","50s"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Age", breaks=c("2030s","40s","50s") ) +
scale_shape_discrete(name = "Age", breaks = c("2030s","40s","50s") ) +
guides(
shape = guide_legend("Age"),
colour = guide_legend("Age"),
size = guide_legend("Age")
) } %>%
add_brackets(three_brackets)
p6
png("Fig_6.png", width=18, height=18,units="cm", res=400)
p6
pdf(paste0("Fig_6.pdf"), width=7, height=7)
p6
dev.off()
## png
## 2
round(exp(coef(ml6g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.598 2.153 0.578 0.255 0.799 0.626 0.459
round(exp(coef(ml6g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.428 1.818 0.604 0.337 0.825 0.525 0.452
round(exp(coef(ml6g3)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.361 1.878 0.485 0.285 0.860 0.748 0.804
table(df$income)
##
## 평균 이상이다. 평균 이하이다. 평균이다.
## 95 65 240
ml7g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$income == "평균 이상이다.", ])
summary(ml7g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$income == "평균 이상이다.",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29474 0.27763 0.26184 0.16579
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.000421
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.429433 0.116359 3.6906 0.0002237 ***
## length3 0.748315 0.110188 6.7912 1.112e-11 ***
## wage2 -0.433162 0.094563 -4.5807 4.634e-06 ***
## wage3 -0.880642 0.110766 -7.9505 1.776e-15 ***
## work2 0.042048 0.100871 0.4169 0.6767876
## work3 -0.250938 0.107125 -2.3425 0.0191560 *
## none -0.507194 0.144071 -3.5205 0.0004308 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -963.88
ml7g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$income == "평균이다.", ])
summary(ml7g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$income == "평균이다.",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.29583 0.25312 0.24531 0.20573
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 2.06E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.312277 0.074905 4.1690 3.060e-05 ***
## length3 0.542547 0.072369 7.4969 6.528e-14 ***
## wage2 -0.687096 0.060644 -11.3300 < 2.2e-16 ***
## wage3 -1.424996 0.080528 -17.6957 < 2.2e-16 ***
## work2 -0.233827 0.066922 -3.4940 0.0004758 ***
## work3 -0.505815 0.069435 -7.2847 3.222e-13 ***
## none -0.702651 0.085148 -8.2521 2.220e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -2357.3
ml7g3 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$income == "평균 이하이다.", ])
summary(ml7g3)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$income == "평균 이하이다.",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.26346 0.22885 0.24615 0.26154
##
## nr method
## 5 iterations, 0h:0m:0s
## g'(-H)^-1g = 3.78E-06
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.571379 0.158376 3.6077 0.0003089 ***
## length3 1.017881 0.151733 6.7084 1.968e-11 ***
## wage2 -0.484640 0.121240 -3.9974 6.405e-05 ***
## wage3 -1.168216 0.156129 -7.4824 7.305e-14 ***
## work2 -0.405462 0.132923 -3.0503 0.0022858 **
## work3 -0.625607 0.140617 -4.4490 8.627e-06 ***
## none -0.072418 0.169185 -0.4280 0.6686199
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -641.69
# Prepare results for plotting group 1
tmp <- ml6g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.10173323 0.09763328 0.07929824 0.10677608 0.08848310 0.09240248 0.12152756
modelframe7a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe7a$term <- as.character(modelframe7a$term)
dwplot(modelframe7a)
# Prepare results for plotting group 2
tmp <- ml7g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.07490525 0.07236942 0.06064414 0.08052780 0.06692240 0.06943479 0.08514776
modelframe7b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe7b$term <- as.character(modelframe7b$term)
dwplot(modelframe7b)
# Prepare results for plotting group 3
tmp <- ml7g3
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.1583763 0.1517333 0.1212402 0.1561294 0.1329234 0.1406172 0.1691848
modelframe7c <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe7c$term <- as.character(modelframe7c$term)
dwplot(modelframe7c)
# combine results for all three groups into one data frame for plotting
modelframe_allg7 <- rbind(modelframe7a, modelframe7b, modelframe7c) # dataframe with outputs for all groups
modelframe_allg7$model <- rep(c("High","Middle","Low"), each = 7)
modelframe_allg7 <- modelframe_allg7[order(modelframe_allg7$model, decreasing = T), ]
modelframe_allg7$model <- factor (modelframe_allg7$model,
levels = c("High","Middle","Low"),
labels = c("High","Middle","Low"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p7 <- {dwplot( modelframe_allg7 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"High","Middle","Low"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Income", breaks=c("High","Middle","Low") ) +
scale_shape_discrete(name = "Income", breaks = c("High","Middle","Low") ) +
guides(
shape = guide_legend("Income"),
colour = guide_legend("Income"),
size = guide_legend("Income")
) } %>%
add_brackets(three_brackets)
p7
png("Fig_7.png", width=18, height=18,units="cm", res=400)
p7
pdf(paste0("Fig_7.pdf"), width=7, height=7)
p7
dev.off()
## png
## 2
round(exp(coef(ml7g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.536 2.113 0.648 0.415 1.043 0.778 0.602
round(exp(coef(ml7g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.367 1.720 0.503 0.241 0.791 0.603 0.495
round(exp(coef(ml7g3)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.771 2.767 0.616 0.311 0.667 0.535 0.930
table(df$experiment)
##
## 고용연장 일자리
## 224 176
ml8g1 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$experiment == "고용연장", ])
summary(ml8g1)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$experiment == "고용연장",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.28237 0.26395 0.27009 0.18359
##
## nr method
## 4 iterations, 0h:0m:0s
## g'(-H)^-1g = 0.00456
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.431068 0.077499 5.5622 2.663e-08 ***
## length3 0.757893 0.073514 10.3094 < 2.2e-16 ***
## wage2 -0.504819 0.061734 -8.1773 2.220e-16 ***
## wage3 -1.097372 0.076422 -14.3593 < 2.2e-16 ***
## work2 -0.094517 0.067427 -1.4018 0.161
## work3 -0.314867 0.070282 -4.4800 7.463e-06 ***
## none -0.506352 0.091810 -5.5152 3.483e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -2245.9
ml8g2 <- mlogit(tmp_choice ~ length2 + length3 +
wage2 + wage3 +
work2 + work3 +
none | 0, cbc[cbc$experiment == "일자리", ])
summary(ml8g2)
##
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 +
## work2 + work3 + none | 0, data = cbc[cbc$experiment == "일자리",
## ], method = "nr")
##
## Frequencies of alternatives:choice
## 1 2 3 4
## 0.30043 0.24361 0.22301 0.23295
##
## nr method
## 5 iterations, 0h:0m:0s
## g'(-H)^-1g = 2.36E-06
## successive function values within tolerance limits
##
## Coefficients :
## Estimate Std. Error z-value Pr(>|z|)
## length2 0.304976 0.089007 3.4264 0.0006115 ***
## length3 0.531279 0.086745 6.1246 9.089e-10 ***
## wage2 -0.716891 0.072555 -9.8807 < 2.2e-16 ***
## wage3 -1.449748 0.096654 -14.9993 < 2.2e-16 ***
## work2 -0.312846 0.079214 -3.9494 7.836e-05 ***
## work3 -0.656698 0.083580 -7.8571 3.997e-15 ***
## none -0.626759 0.098068 -6.3911 1.647e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -1725.8
# Prepare results for plotting group 1
tmp <- ml8g1
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.07749897 0.07351448 0.06173383 0.07642227 0.06742674 0.07028197 0.09180954
modelframe8a <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe8a$term <- as.character(modelframe8a$term)
dwplot(modelframe8a)
# Prepare results for plotting group 2
tmp <- ml8g2
covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
## length2 length3 wage2 wage3 work2 work3 none
## 0.08900669 0.08674453 0.07255476 0.09665411 0.07921413 0.08357968 0.09806804
modelframe8b <- data.frame(term = c(
'2yr',
'3yr',
'70%',
'60%',
'transfer',
'part_time',
'No-choice option'),
estimate = c( tmp$coefficients[1], tmp$coefficients[2],
tmp$coefficients[3], tmp$coefficients[4],
tmp$coefficients[5], tmp$coefficients[6],
tmp$coefficients[7]),
std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"],
coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"],
coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"],
coef(summary(tmp))[7, "Std. Error"]
)
)
modelframe8b$term <- as.character(modelframe8b$term)
dwplot(modelframe8b)
# combine results for all three groups into one data frame for plotting
modelframe_allg8 <- rbind(modelframe8a, modelframe8b) # dataframe with outputs for all groups
modelframe_allg8$model <- rep(c("Hire","Job"), each = 7)
modelframe_allg8 <- modelframe_allg8[order(modelframe_allg8$model, decreasing = T), ]
modelframe_allg8$model <- factor (modelframe_allg8$model,
levels = c("Hire","Job"),
labels = c("Hire","Job"))
three_brackets <- list(c("length\nRef: 1yr",
"2yr",
"3yr"),
c("wage\nRef: 80%",
'70%',
'60%'),
c("work\nRef: reduction",
'transfer',
'part-time')
)
p8 <- {dwplot( modelframe_allg8 ,
vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),
model_order = c(
"Hire","Job"
),
dot_args = list(aes(shape = model) , size = 2.5 ) ) +
theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +
theme(plot.title = element_text(face="bold"),
legend.position = "bottom",
legend.title = element_blank()) + scale_color_grey(name = "Experiment", breaks=c("Hire","Job") ) +
scale_shape_discrete(name = "Experiment", breaks = c("Hire","Job") ) +
guides(
shape = guide_legend("Experiment"),
colour = guide_legend("Experiment"),
size = guide_legend("Experiment")
) } %>%
add_brackets(three_brackets)
p8
png("Fig_8.png", width=18, height=18,units="cm", res=400)
p8
pdf(paste0("Fig_8.pdf"), width=7, height=7)
p8
dev.off()
## png
## 2
round(exp(coef(ml8g1)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.539 2.134 0.604 0.334 0.910 0.730 0.603
round(exp(coef(ml8g2)),3)
## length2 length3 wage2 wage3 work2 work3 none
## 1.357 1.701 0.488 0.235 0.731 0.519 0.534
노인일자리 사업의 필요성 및 효과성에 대한 실험연구: 정책 프레이밍과 연령대의 이원상호작용 효과(오주현 등, 2022)
psych::alpha(df[c("effect1",
"effect2",
"effect3",
"effect4",
"effect5",
"need1") ],
check.keys=TRUE)
##
## Reliability analysis
## Call: psych::alpha(x = df[c("effect1", "effect2", "effect3", "effect4",
## "effect5", "need1")], check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.96 0.96 0.96 0.81 26 0.003 2.3 0.99 0.8
##
## 95% confidence boundaries
## lower alpha upper
## Feldt 0.96 0.96 0.97
## Duhachek 0.96 0.96 0.97
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## effect1 0.95 0.95 0.95 0.81 21 0.0036 0.0014 0.80
## effect2 0.95 0.95 0.95 0.81 21 0.0036 0.0014 0.81
## effect3 0.95 0.95 0.95 0.80 20 0.0038 0.0011 0.80
## effect4 0.95 0.95 0.95 0.80 20 0.0038 0.0012 0.79
## effect5 0.96 0.96 0.96 0.82 23 0.0033 0.0015 0.81
## need1 0.96 0.96 0.95 0.82 23 0.0034 0.0018 0.81
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## effect1 400 0.92 0.92 0.91 0.89 2.3 1.1
## effect2 400 0.92 0.92 0.90 0.88 2.3 1.0
## effect3 400 0.93 0.93 0.92 0.90 2.3 1.1
## effect4 400 0.93 0.93 0.92 0.90 2.3 1.1
## effect5 400 0.89 0.89 0.86 0.85 2.2 1.1
## need1 400 0.90 0.90 0.87 0.86 2.3 1.1
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## effect1 0.29 0.27 0.30 0.12 0.02 0
## effect2 0.28 0.32 0.27 0.12 0.01 0
## effect3 0.28 0.28 0.30 0.11 0.02 0
## effect4 0.30 0.26 0.30 0.12 0.02 0
## effect5 0.31 0.28 0.28 0.11 0.02 0
## need1 0.30 0.24 0.29 0.14 0.03 0
# generate mean of all items
df$effect <- apply(df[c("effect1",
"effect2",
"effect3",
"effect4",
"effect5",
"need1") ], 1, mean, na.rm=TRUE)
hist(df$effect)
psych::alpha(df[c("need2",
"need3",
"need4",
"need5") ],
check.keys=TRUE)
##
## Reliability analysis
## Call: psych::alpha(x = df[c("need2", "need3", "need4", "need5")], check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.95 0.95 0.94 0.84 20 0.0039 2.6 1.2 0.83
##
## 95% confidence boundaries
## lower alpha upper
## Feldt 0.94 0.95 0.96
## Duhachek 0.95 0.95 0.96
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## need2 0.93 0.93 0.90 0.82 14 0.0059 0.00069 0.82
## need3 0.93 0.93 0.90 0.82 14 0.0060 0.00090 0.81
## need4 0.94 0.94 0.92 0.84 16 0.0053 0.00186 0.82
## need5 0.95 0.95 0.93 0.86 19 0.0044 0.00049 0.85
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## need2 400 0.95 0.95 0.93 0.91 2.6 1.3
## need3 400 0.95 0.95 0.93 0.91 2.6 1.3
## need4 400 0.93 0.93 0.90 0.88 2.6 1.3
## need5 400 0.91 0.91 0.87 0.85 2.6 1.2
##
## Non missing response frequency for each item
## 1 2 3 4 5 miss
## need2 0.29 0.18 0.25 0.23 0.05 0
## need3 0.30 0.17 0.25 0.22 0.06 0
## need4 0.30 0.17 0.26 0.20 0.07 0
## need5 0.28 0.19 0.26 0.22 0.05 0
df$need <- apply(df[c("need2",
"need3",
"need4",
"need5") ], 1, mean, na.rm=TRUE)
hist(df$need)
library(gmodels)
CrossTable(df$gender)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 남성 | 여성 |
## |-----------|-----------|
## | 308 | 92 |
## | 0.770 | 0.230 |
## |-----------|-----------|
##
##
##
##
CrossTable(df$affiliation)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 공기업 | 사기업 |
## |-----------|-----------|
## | 200 | 200 |
## | 0.500 | 0.500 |
## |-----------|-----------|
##
##
##
##
CrossTable(df$age)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 장년 | 중년 | 청년 |
## |-----------|-----------|-----------|
## | 151 | 118 | 131 |
## | 0.378 | 0.295 | 0.328 |
## |-----------|-----------|-----------|
##
##
##
##
CrossTable(df$income)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 평균 이상이다. | 평균 이하이다. | 평균이다. |
## |-----------|-----------|-----------|
## | 95 | 65 | 240 |
## | 0.237 | 0.163 | 0.600 |
## |-----------|-----------|-----------|
##
##
##
##
CrossTable(df$peak)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 0 | 1 |
## |-----------|-----------|
## | 83 | 317 |
## | 0.207 | 0.792 |
## |-----------|-----------|
##
##
##
##
CrossTable(df$kgroup)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 1 | 2 | 3 |
## |-----------|-----------|-----------|
## | 75 | 163 | 162 |
## | 0.188 | 0.407 | 0.405 |
## |-----------|-----------|-----------|
##
##
##
##
CrossTable(df$experiment)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 400
##
##
## | 고용연장 | 일자리 |
## |-----------|-----------|
## | 224 | 176 |
## | 0.560 | 0.440 |
## |-----------|-----------|
##
##
##
##
## 3.1. 정책효과성
df %>%
group_by(gender) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
t.test(df$effect ~ df$gender)
##
## Welch Two Sample t-test
##
## data: df$effect by df$gender
## t = -3.3624, df = 152.78, p-value = 0.0009766
## alternative hypothesis: true difference in means between group 남성 and group 여성 is not equal to 0
## 95 percent confidence interval:
## -0.6088576 -0.1581760
## sample estimates:
## mean in group 남성 mean in group 여성
## 2.208874 2.592391
df %>%
group_by(affiliation) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
res.aov <- aov(effect ~ affiliation, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## affiliation 1 5.3 5.252 5.466 0.0199 *
## Residuals 398 382.4 0.961
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(age) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
res.aov <- aov(effect ~ age, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## age 2 15.8 7.906 8.44 0.000257 ***
## Residuals 397 371.9 0.937
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(income) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
res.aov <- aov(effect ~ income, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## income 2 0.4 0.2159 0.221 0.802
## Residuals 397 387.2 0.9754
df %>%
group_by(peak) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
t.test(df$effect ~ df$peak)
##
## Welch Two Sample t-test
##
## data: df$effect by df$peak
## t = 2.1049, df = 160.55, p-value = 0.03685
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.01363054 0.42751575
## sample estimates:
## mean in group 0 mean in group 1
## 2.471888 2.251314
df %>%
group_by(kgroup) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
res.aov <- aov(effect ~ kgroup, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## kgroup 1 275.6 275.65 979.4 <2e-16 ***
## Residuals 398 112.0 0.28
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(experiment) %>%
summarise(count = n(),
mean = mean(effect),
sd=sd(effect))
t.test(df$effect ~ df$experiment)
##
## Welch Two Sample t-test
##
## data: df$effect by df$experiment
## t = 2.5658, df = 383.81, p-value = 0.01067
## alternative hypothesis: true difference in means between group 고용연장 and group 일자리 is not equal to 0
## 95 percent confidence interval:
## 0.05877266 0.44420353
## sample estimates:
## mean in group 고용연장 mean in group 일자리
## 2.407738 2.156250
df %>%
group_by(gender) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
t.test(df$need ~ df$gender)
##
## Welch Two Sample t-test
##
## data: df$need by df$gender
## t = -1.6426, df = 164.8, p-value = 0.1024
## alternative hypothesis: true difference in means between group 남성 and group 여성 is not equal to 0
## 95 percent confidence interval:
## -0.4788555 0.0439317
## sample estimates:
## mean in group 남성 mean in group 여성
## 2.516234 2.733696
df %>%
group_by(affiliation) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
res.aov <- aov(need ~ affiliation, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## affiliation 1 18.3 18.276 13.41 0.000284 ***
## Residuals 398 542.3 1.363
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(age) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
res.aov <- aov(need ~ age, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## age 2 14.3 7.150 5.196 0.00592 **
## Residuals 397 546.3 1.376
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(income) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
res.aov <- aov(need ~ income, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## income 2 1.3 0.6458 0.458 0.633
## Residuals 397 559.3 1.4089
df %>%
group_by(peak) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
t.test(df$need ~ df$peak)
##
## Welch Two Sample t-test
##
## data: df$need by df$peak
## t = 2.7024, df = 137.74, p-value = 0.007751
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.09993683 0.64503676
## sample estimates:
## mean in group 0 mean in group 1
## 2.861446 2.488959
df %>%
group_by(kgroup) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
res.aov <- aov(need ~ kgroup, data = df)
summary(res.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## kgroup 1 457.2 457.2 1760 <2e-16 ***
## Residuals 398 103.4 0.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df %>%
group_by(experiment) %>%
summarise(count = n(),
mean = mean(need),
sd=sd(need))
t.test(df$need ~ df$experiment)
##
## Welch Two Sample t-test
##
## data: df$need by df$experiment
## t = 1.5565, df = 384.76, p-value = 0.1204
## alternative hypothesis: true difference in means between group 고용연장 and group 일자리 is not equal to 0
## 95 percent confidence interval:
## -0.04849415 0.41700064
## sample estimates:
## mean in group 고용연장 mean in group 일자리
## 2.647321 2.463068
table(df$affiliation, df$experiment)
##
## 고용연장 일자리
## 공기업 124 76
## 사기업 100 100
# mean by group
aggregate(need ~ affiliation,
data = df,
FUN = mean
)
aggregate(need ~ affiliation + experiment,
data = df,
FUN = mean
)
# mean by group
aggregate(effect ~ affiliation,
data = df,
FUN = mean
)
library(dplyr)
group_by(df, experiment, affiliation) %>%
summarise(
mean = mean(need, na.rm = TRUE),
sd = sd(need, na.rm = TRUE),
)
## `summarise()` has grouped output by 'experiment'. You can override using the
## `.groups` argument.
library(ggplot2)
ggplot(df) +
aes(x = experiment, y = need, fill = affiliation) +
geom_boxplot()
# Two-way ANOVA with interaction
# save model
mod <- aov(need ~ experiment * affiliation,
data = df
)
# print results
summary(mod)
## Df Sum Sq Mean Sq F value Pr(>F)
## experiment 1 3.3 3.346 2.521 0.11311
## affiliation 1 20.5 20.515 15.459 9.95e-05 ***
## experiment:affiliation 1 11.2 11.236 8.467 0.00382 **
## Residuals 396 525.5 1.327
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# method 2
library(ggpubr)
ggline(subset(df, !is.na(experiment)), # remove NA level for sex
x = "experiment",
y = "need",
line = "affiliation",
add = c("mean_se") # add mean and standard error
) +
labs(y = "NEED")
ggsave("need.png")
## Saving 7 x 5 in image
with(
df,
interaction.plot(experiment, affiliation, need)
)
mod2 <- lm(need ~ experiment + affiliation,
data = df
)
summary(mod2)
##
## Call:
## lm.default(formula = need ~ experiment + affiliation, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8999 -1.1685 0.1001 0.8501 2.7962
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.44363 0.09365 26.093 < 2e-16 ***
## experiment일자리 -0.23981 0.11799 -2.032 0.042773 *
## affiliation사기업 0.45628 0.11714 3.895 0.000115 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.163 on 397 degrees of freedom
## Multiple R-squared: 0.04256, Adjusted R-squared: 0.03774
## F-statistic: 8.824 on 2 and 397 DF, p-value: 0.000178
mod3 <- lm(effect ~ experiment + affiliation,
data = df
)
summary(mod3)
##
## Call:
## lm.default(formula = effect ~ experiment + affiliation, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.55343 -0.93657 -0.05343 0.70976 2.73010
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.29024 0.07824 29.273 < 2e-16 ***
## experiment일자리 -0.28353 0.09857 -2.876 0.00424 **
## affiliation사기업 0.26319 0.09786 2.690 0.00746 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9714 on 397 degrees of freedom
## Multiple R-squared: 0.03369, Adjusted R-squared: 0.02882
## F-statistic: 6.92 on 2 and 397 DF, p-value: 0.001112
# Two-way ANOVA with interaction
# save model
mod4<- aov(effect ~ experiment * affiliation,
data = df
)
# print results
summary(mod4)
## Df Sum Sq Mean Sq F value Pr(>F)
## experiment 1 6.2 6.234 6.63 0.01039 *
## affiliation 1 6.8 6.826 7.26 0.00735 **
## experiment:affiliation 1 2.3 2.313 2.46 0.11759
## Residuals 396 372.3 0.940
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# method 2
library(ggpubr)
ggline(subset(df, !is.na(experiment)), # remove NA level for sex
x = "experiment",
y = "effect",
line = "affiliation",
add = c("mean_se") # add mean and standard error
) +
labs(y = "Effect")
ggsave("effect.png")
## Saving 7 x 5 in image
library(ggplot2)
ggplot(df) +
aes(x = experiment, y = effect, fill = affiliation) +
geom_boxplot()