rm(list=ls())
gc()
##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 514580 27.5    1148107 61.4   644242 34.5
## Vcells 920313  7.1    8388608 64.0  1637054 12.5
library(conjoint)
library(skpr)
## 필요한 패키지를 로딩중입니다: shiny
library(AlgDesign)
library(mlogit)
## 필요한 패키지를 로딩중입니다: dfidx
## 
## 다음의 패키지를 부착합니다: 'dfidx'
## The following object is masked from 'package:stats':
## 
##     filter
library(texreg)
## Version:  1.38.6
## Date:     2022-04-06
## Author:   Philip Leifeld (University of Essex)
## 
## Consider submitting praise using the praise or praise_interactive functions.
## Please cite the JSS article in your publications -- see citation("texreg").
library(dotwhisker)
## 필요한 패키지를 로딩중입니다: ggplot2
library(ggplot2)
library(jtools)
library(grid)
library(gridExtra)
library(dplyr)
## 
## 다음의 패키지를 부착합니다: 'dplyr'
## The following object is masked from 'package:gridExtra':
## 
##     combine
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

1. 개요

1.1. 컨조인트분석 정의

컨조인트 분석(Conjoint Analysis)은 실험설계에 의해 구성된 다속성자 극물(multiattribute stimuli)1)에 대한 소비자의 선호를 수리적으로 분석하여 어떤 제품이 갖고 있는 속성(attribute) 하나 하나에 고객이 부여하는 효용(utility)을 추정함으로써, 그 고객이 선택할 제품을 예측하기 위한 기법이다. Hair, Anderson, Tatham and Black(1995)에 의하면 컨조인트 분석은 응답자가 재화 또는 서비스에 대한 선호를 어떻게 나타내는지를 이해하기 위해 특별하게 사용된 다변량기법(multivariate technique)이라고 정의한다. 즉 소비자가 원하는 속성을 분석하여 상품이나 서비스의 속성 을 최적으로 구성하는데 이용될 수 있는 방법이다.

그러므로 컨조인트 분석은 특정대상의 속성들에 대한 평가를 통해 이들이 어느 정도 공헌하고 있는지 분석하는 것이다. 이 분석은 각각의 독립변수(각 속성)가 전체의 종속변수(효용)에 가산적(additive)으로 공헌한다는 가정하에 분석된다.

1.2. 컨조인트 분석의 기본 절차

기본절차 속성과 속성수준 선정, 분석모델의 결정, 자료수집방법 선정, 자극물 제시 방법선택, 모수추정방법 선정, 결과 해석의 순서로 이루어진다. 다음에서 이를 구체적으로 살펴 보고자 한다.

(1) 속성과 속성수준 결정

컨조인트 분석을 실시하기 위하여 먼저 소비자가 제품을 구입할 때 고려하는 속성(독립변수)들이 무엇인지를 알아야 한다. 평가대상의 선호도나 효용을 측정하는데 필요한 속성들을 규명하기 위하여 주로 이용되는 방법은 평가자들에게 중요하게 생각하는 속성(독립변수)들이 무엇인지 직접 물어보는 방법이다. 이외에도 표적집단면접(FGI)이나 심층면접(Depth Interview)을 이용하거나 전문가들의 조언을 통하여 선정하는 방식을 취할 수 있다. 본고의 적용 예에서는 소비자들이 아직 인터넷 구매 경험이 부족하기 때문에 전자상거래 관련 전문가들의 판단을 통해 속성을 선정하 는 방법을 사용하였다.

이상과 같은 방법을 통하여 관련된 속성들을 선정하는데, 이 과정에서 가장 중요한 것은 분석의 효율성을 고려하여 속성의 수를 중요한 몇 개로 축소하는 것이다.

속성의 수와 명칭이 정해지고 나면 개별 속성들의 수준을 결정해야 한다. 속성수준을 결정할 때 고려해야 할 사항은 각 속성별 수준의 수와 각 수준간의 적정차이의 결정이다. 여기에서 각 속성 수준의 수에 따라 응답자가 비교 평가하여야 할 총 속성의 수가 정해짐을 고려해야 한다. 만약 세 가지의 속성이 선정되었을 경우 두 속성은 3개의 수준으로, 한가지 속성은 2개의 수준으로 나누었다면 응답자에게 제시되는 총자극의 수는 32×2=18개가 된다. 총자극의 수가 너무 많아지면 응답자가 응답에 곤란을 겪게 되고, 반대로 총자극의 수가 너무 적어지면 모델의 추정오차가 커지게 된다.

따라서 적정수준의 자극의 수를 사용하는 것이 바람직하다. 다음 으로 고려할 사항은 각 수준을 어떻게 나눌 것인가이다. 각 수준들이 응답자에게 거의 차이가 없으면 수준을 나누는 의미가 없어지기 때문에 가능한 한 수준을 나눌 때는 현재 출시되고 있는 제품들의 수준을 반영해주는 것이 바람직하다.

(2) 분석모델의 결정

두 번째 단계는 분석모델을 결정하는 것이다. 분석모델이란 선호도를 계산하는 방법으로 이상점모델(ideal point model), 벡터모델(vector model), 부분가치함수모델(part worth function model)이 있다. 각 모델들은 다음과 같이 설명될 수 있다. 예를 들어 다음과 같이 속성 \(p\)를 정의한다고 하자.

\[ p = 1, 2,.......................................,t\]

식 (1)을 선정된 \(t\)개 속성의 집합이라고 하고, \(y_{jp}\)\(j\)번째 자극에서 \(p\)번째 속성의 수준이라고 하자. 여기서 \(y_{jp}\)는 연속적(continuous) 변수일 수도 있고, 명목적(categorical) 변수일 수도 있다.

먼저 벡터모델(vector model)에서는 \(j\)번째 자극에 대한 선호도(\(S_j\))가 다음과 같이 나타난다.

\[S_j = \Sigma{W_p y_{jp}}\]

여기서 \(W_p\)\(t\) 속성에 대해 개인마다 다르게 부여하는 가중치이다. 따라서 벡터모델은 소비자의 선호도가 대상이 지니고 있는 각 속성수준과 이에 부여되는 가중치에 의해 결정된다고 보고 있다. 만약 벡터의 속성과 같이 속성에 대한 점수들이 무한히 커지면 선호도도 커지게되므로 이는 다 속성태도모델인 피시바인-로젠버그모델(Fishbein- Rosenberg Model)과 같다. 피시바인-로젠버그 모델에서도 각 대상들이 가지고 있는 속성의 양이 많으면 많을수록 대상에 대한 태도는 좋아진다.

이상점모델(ideal point model)은 선호도(\(S_j\))가 \(j\)번째 자극의 \(y_{jp}\)와 개인의 이상점인 \(x_P\) 사이의 거리의 제곱(\(d^2_j\))과 음(-)의 관계에 있다고 가정한 다. 여기서 \(d^2_j\)은 다음과 같이 정의한다.

\[d^2_j = \Sigma{W_p}(y_{jp} - X_p)^2 \] 따라서 이상점에 가까운(작은 \(d^2_j\)) 자극을 더 선호하게 된다.

부분가치함수모델(part worth function model)을 식으로 표현하면 다음과 같다.

\[S_j = \Sigma{f_p(y_{jp})}\] 여기서 \(f_p\)\(p\)번째 속성의 여러 수준에 대한 부분가치를 나타내는 함수이다. \(f_p(y_{jp})\)는 보통 3~4개의 \(y_{jp}\)를 이용하여 추정하는데, 각 속성수준(\(y_{jp}\))에 대한 부분가치는 선형보간법(linear interpolation)에 의해 얻어지므로 부분가치함수는 꺽인 선형함수의 형태를 띠게 된다. 추정범위 밖에 존재하는 \(y_{jp}\)에 대한 부분가치는 꺽인 선형함수의 외삽법(extrapolation)에 의해 구할 수는 있지만 타당성은 보장되지 않는다.

부분가치함수모델은 각 속성수준에 대한 선호함수가 다양한 형태를 가질 수 있으므로 상당한 융통성을 지닌 모델이다. 왜냐하면 \(f_p(y_{jp}) = w_p(y_{jp} - x_p)^2\)으로 정의하면 이상점 모델이 될 수도 있고, \(f_p(y_{jp}) = w_py_{jp}\)로 정의하면 벡터모델을 얻을 수도 있기 때문이다. 한편, 이상점모델은 벡터모델 보다 융통성이 있다. 왜냐하면 벡터모델은 이상점모델의 특수한 형태, 즉 \(x_p → \pm\infty\) 인 경우이기 때문이다.

부분가치함수모델은 융통성이 있기는 하지만 더 많은 모수를 추정해야 하는 부담(신뢰성이 낮아짐)과 선형보간법에 의해 중간값을 근사치로 구해야 한다는 단점이 있다. 즉 벡터모델은 추정해야 할 모수(\(w_p\))가 \(t\)개이고, 이상점모델은 \(w_p\)\(x_p\)가 있으므로 \(2t\)개인 반면에, 부분가치함수모델은 각 속성마다 \(q\)개의 수준이 있으면 \((q-1)t\)개의 모수를 추정해야 하는 것이다. 여기서 \((q-1)\)이 되는 이유는 \(f_p(y_{jp})\)를 모델을 변경시키지 않고 \(f_p(y_{jp}) = a_p\)로 나타낼 수 있으므로 첫 번째 수준에 대한 부분가치는 0의 값을 가지기 때문이다.

(3) 자료수집 방법선정 및 컨조인트 디자인

컨조인트 분석을 위해서는 자극(속성수준)을 이용하여 분석에 필요한 순위자료를 응답자로부터 얻어야 한다. 순위를 이용한 자료는 서열척도이며 응답자가 가장 좋아하는 것을 1, 그 다음 좋아하는 것을 2로 하여 제품의 선호도를 얻을 수 있다.

이 때 속성의 수나 속성수준이 많아지면 순위척도에 의한 응답이 현실적으로 불가능하므로, 자료수집의 어려움을 덜기 위한 방법으로서 다음의 몇가지 방법이 사용된다.

  • 첫째, 전체자극들을 우선 ‘가장 선호’, ‘중간선호’, ‘싫어하는 선호’ 순으로 분류한 후 각 분류집단내에서 자극들의 선호순위 를 정하고 각 집단간의 경계부분에 있는 자극들을 비교하여 순위를 확정하는 방법이다. 이 방법은 응답자의 비교범위를 축소시켜 주기 때문에 전체자극의 수가 많을 경우에 널리 사용되고 있다.

  • 둘째, 속성의 수가 많은 경우 전체 속성을 모두 고려하는 총체적 접근법(full profile approach)이 아닌 두 속성씩 비교하는 트레이드오프 제시법(trade-off method)을 사용하기도 한다. 이 방법은 응답자가 한번에 처리하여야 하는 정보의 양을 줄여주는 장점이 있으나, 전체적으로 평가하여야 하는 자극의 수는 많아 지며, 하나의 자극을 한번에 평가하지 않으므로 평가간에 모순 (intransitivity)이 생길 수 있다.

  • 셋째, 전체 대상간에 서열척도를 구하는 대신 각 자극들의 선호도를 등급법이나 리커트 형태의 척도에 의해 구한 후 이를 다시 서열척도로 변환하는 방법이다. 이 방법도 자주 사용되나 자극들간에 같은 응답이 많이 나오기 때문에 자극의 수가 많아지면 사용하기 불편하다.

  • 넷째, 직접 비교대상이 되는 자극의 수를 줄이는 방법이다. 이 방법은 각 속성단계들간의 상호작용을 최소화하면서 추정에 필요 한 자극의 수를 줄이는 방법으로 부분요인설계(fractional factorial design)를 하는 방법이다.

(4) 자극물 제시 방법 선택

다음 과제는 이러한 조합 또는 자극이 응답자에게 표현되는 방식을 선택하는 것이다. 자극을 응답자에게 제시하는 방법은 어구적 표현(verbal description), 산문적 표현(paragrape design), 회화적 표현(pictorial presentation)이 있다. 이외에도 PC를 이용하여 체계적으로 각 속성의 조합을 제시하는 방법도 있다. 사실 어떠한 방법이나 큰 차이가 없으나 연구의 취향이나 평가자의 수준에 따라 적절한 것을 사용하면 좋을것이다.

(5) 컨조인트 모수의 추정

컨조인트 모수추정방법은 척도의 종류에 따라 크게 2가지 범주로 나눌 수 있으며, 각 방법들은 다음과 같은 특징들이 있다.

  • 첫째, 자료가 넌메트릭(nonmetric)일 때 모수를 추정하는 대표적인 방법으로는 MONANOVA, PREMAP, LINMAP 등이 있으며, 이 방법들은 응답자가 선호도에 따라 각 프로필에 순위를 부여한 경우에 적합하다.

  • 둘째, 자료가 메트릭(metric)일 때 모수를 추정하는 대표적인 방법은 회귀분석이 다. 이 방법은 응답자가 선호도에 따라 각 프로필에 점수를 부여한 경우에 적합하다.

그러나 시뮬레이션을 이용한 여러 연구 결과에 따르면, 종속변수의 종류에 관계없이 어느 방법을 사용하더라도 예측타당성에는 거의 차이가 없는 것으로 밝혀졌다. 예를 들어, 종속변수가 순위인 경우에 회귀분석을 이용하여 계수를 추정하더라도 넌메트릭 자료를 이용하여 추정한 것과 거의 같은 예측 타당성을 보인다는 것이다(Green and Srinivasan 1978). 넌메트릭 자료들은 회귀분석에 비해서 널리 보급이 되어 있지 못하고 또 회귀분석에 비하여 계산속도가 느리기 때문에 많은 연구들이회귀분석을 이용하고 있다

넌메트릭 자료들 가운데에서는 LINMAP이 다른 방법들에 비하여 여러 가지 바람직한 특성을 가지고 있다. 우선, LINMAP은 선형계획법(linear programming)을 사용하고 있기 때문에, 대수학(calculus)에 의존하고 있는 다른 방법들과는 달리, 계수를 추정할 때 전체최적(global optimum)을 달성할 수 있다. 또한 LINMAP을 사용하면 부분가치함수에 각종 제약조건(constraint)들을 부과하여 벡터모델이나 이상점모델로 바꿔 주므로써, 추정되는 계수의 수를 줄여서 예측타당성을 향상시킬 수 있는 반면, 다른 방법들을 사용하면 이같은 제약조건들을 부가하는 것이 불가능하다. 특히 MONANOVA는 부분가치함수모델에만 모수추정이 가능한 방법이며, 요인설계(factorial design)에 의해 수집한 자료를 변환시키는 과정을 통해 모수를 추정한다.

(6) 결과 해석

자료분석을 통한 결과 해석의 주요 내용은 전형적으로 다음과 같은 것을 포함한다.

  • 첫째, 각 응답자에 대한 각 속성수준의 부분가치(효용) 추정,

  • 둘째, 인구통계적(demographic) 변수를 이용한시장세분(segments) 분,

  • 셋째, 속성의 상대적 중요도 결정,

  • 넷째, 초이스 시뮬레이션(choice simullation),

  • 다섯째, 소비자 만족도 향상을 위한 쇼핑모형을 찾아보는 것 등이다.

2.Candidate MrsKim

This DCE construction is described in the Theory Introduction to ExpertChoice vignette. It is intended to show more succinctly (than the silver example above) how to design such an experiment

2.1. step 0

library(ExpertChoice)
library(DoE.base)
## 필요한 패키지를 로딩중입니다: conf.design
## Registered S3 method overwritten by 'DoE.base':
##   method           from       
##   factorize.factor conf.design
## 
## 다음의 패키지를 부착합니다: 'DoE.base'
## The following objects are masked from 'package:stats':
## 
##     aov, lm
## The following object is masked from 'package:graphics':
## 
##     plot.design
## The following object is masked from 'package:base':
## 
##     lengths
#Step 0
# Described in Theory
kim <- list(
  length = c("1", "2", "3"),
  wage = c("1", "2", "3"),
  work = c("1", "2", "3")
)
kim
## $length
## [1] "1" "2" "3"
## 
## $wage
## [1] "1" "2" "3"
## 
## $work
## [1] "1" "2" "3"

2.2. step 1

# Step 1
kim_examp <- full_factorial(kim)
kim_examp

2.3. step 2

# Step 2
akim_examp <- augment_levels(kim_examp)
## Applying B mat
#> Applying B mat
#write.csv(ff_examp, "example.csv")

2.4. step 3

# Step 3
nlevels <- unlist(purrr::map(kim_examp, function(x){length(levels(x))}))
#oa_feasible(36, nlevels, strength = 3)

fractional_factorial_kim <- oa.design(nlevels = nlevels, columns = "min34")
# The fractional_factorial design is generated using the DoE.MIParray package.
# The following is the command to run this generation.
# The result is saved in the package.

2.5. step 4

# Step 4
# Confirming that this is an efficient design.
colnames(fractional_factorial_kim) <- colnames(kim_examp)
fractional_factorial_kim <- search_design(kim_examp, fractional_factorial_kim)

2.6. step 5

# Step 5.
# This table is reported as Table
# Confirm that this design supports all interactions.
row1_main_effects <- fractional_factorial_efficiency(~ length + wage + work, fractional_factorial_kim)
## Your fractional factorial design has an A-efficiency of100%
## Your fractional factorial design has a D-efficiency of100%
#> Your fractional factorial design has an A-efficiency of100%
#> Your fractional factorial design has a D-efficiency of100%

2.7. step 6

# Step 6.
# Two different card options
# Option 1
dce_modulo_examp1 <- modulo_method(
  fractional_factorial_kim,
  list(c(1, 0, 1), c(0,1,0))
)

# Option 2.
dce_modulo_examp2 <- modulo_method(
  fractional_factorial_kim,
  list(c(1, 0, 1), c(1,2,1), c(0,2,0))
)

2.8. step 7

# Step 7
# This experiment uses categorical data (not ordinal) hence there can be no pareto dominate solution.
# Each category is merely a choice.

2.9. step 8

# Step 8.
# Compare the efficiencies.
dce_efficency_menu_example1 <- dce_efficiency(akim_examp, dce_modulo_examp1)
## 
## q is1
##   L is3
##   Case 3
##   The implied x is1and y is0
##   s is3
## 
## q is2
##   L is3
##   Case 3
##   The implied x is1and y is0
##   s is3
## 
## q is3
##   L is3
##   Case 3
##   The implied x is1and y is0
##   s is3
## 
## 
## The D-efficiency of this discrete choice experiment is66.667%
dce_efficency_menu_example2 <- dce_efficiency(akim_examp, dce_modulo_examp2)
## 
## q is1
##   L is3
##   Case 3
##   The implied x is1and y is1
##   s is5
## 
## q is2
##   L is3
##   Case 3
##   The implied x is1and y is1
##   s is5
## 
## q is3
##   L is3
##   Case 3
##   The implied x is1and y is1
##   s is5
## 
## 
## The D-efficiency of this discrete choice experiment is80%

2.10. step 9

# Step 9
# Construct the question table
kim_question_table <- construct_question_frame(akim_examp, dce_modulo_examp1)
kim_question_table2 <- construct_question_frame(akim_examp, dce_modulo_examp2)

2.11. step 10

# Finally augment the question table. See Table 1 in the Theoretical Vignette.
levels(kim_question_table$length) <- c("1년 연장", "2년 연장", "3년연장")
levels(kim_question_table$wage) <- c("60%급여", "60%급여", "50%급여")
levels(kim_question_table$work) <- c("근로시간단축", "편한 직무로 보직전환", "계약제로 재고용")
kim_question_table
write.csv(kim_question_table, "kim_question_table.csv", row.names=TRUE, fileEncoding = 'cp949')

3. Conjoint

# install.packages("conjoint")
library(conjoint)
experiment<-expand.grid(length=c("1년 연장", "2년 연장", "3년연장"), 
                        wage=c("80%급여", "70%급여", "60%급여"), 
                        work=c("근로시간단축", "편한 직무로 보직전환", "계약제로 재고용")) 

design=caFactorialDesign(data=experiment,type="full") 
print(design) 
##      length    wage                 work
## 1  1년 연장 80%급여         근로시간단축
## 2  2년 연장 80%급여         근로시간단축
## 3   3년연장 80%급여         근로시간단축
## 4  1년 연장 70%급여         근로시간단축
## 5  2년 연장 70%급여         근로시간단축
## 6   3년연장 70%급여         근로시간단축
## 7  1년 연장 60%급여         근로시간단축
## 8  2년 연장 60%급여         근로시간단축
## 9   3년연장 60%급여         근로시간단축
## 10 1년 연장 80%급여 편한 직무로 보직전환
## 11 2년 연장 80%급여 편한 직무로 보직전환
## 12  3년연장 80%급여 편한 직무로 보직전환
## 13 1년 연장 70%급여 편한 직무로 보직전환
## 14 2년 연장 70%급여 편한 직무로 보직전환
## 15  3년연장 70%급여 편한 직무로 보직전환
## 16 1년 연장 60%급여 편한 직무로 보직전환
## 17 2년 연장 60%급여 편한 직무로 보직전환
## 18  3년연장 60%급여 편한 직무로 보직전환
## 19 1년 연장 80%급여      계약제로 재고용
## 20 2년 연장 80%급여      계약제로 재고용
## 21  3년연장 80%급여      계약제로 재고용
## 22 1년 연장 70%급여      계약제로 재고용
## 23 2년 연장 70%급여      계약제로 재고용
## 24  3년연장 70%급여      계약제로 재고용
## 25 1년 연장 60%급여      계약제로 재고용
## 26 2년 연장 60%급여      계약제로 재고용
## 27  3년연장 60%급여      계약제로 재고용
print(cor(caEncodedDesign(design)))
##        length wage work
## length      1    0    0
## wage        0    1    0
## work        0    0    1
library(conjoint)
library(grid)
library(gridExtra)

3.1. Step 1: Create orthogonal design and full choice set matrix

# declare features and their levels
experiment<-expand.grid(length=  c("A",
                                   "B",
                                       "C"),
                        wage =  c("A",
                                  "B",
                                              "C"),
                        work =  c("A",
                                  "B",
                                  "C"))
# create orthogonal design
design1=caFactorialDesign(data=experiment, 
                         type = "orthogonal")
print(design1)
##    length wage work
## 2       B    A    A
## 6       C    B    A
## 7       A    C    A
## 12      C    A    B
## 13      A    B    B
## 17      B    C    B
## 19      A    A    C
## 27      C    C    C
print(cor(caEncodedDesign(design1)))
##        length wage work
## length      1    0    0
## wage        0    1    0
## work        0    0    1
# produce alternatives via shifting

design1 <- data.frame(lapply(design1, 
                              function(x) 
                                as.numeric(x))) # make values numeric first

design2 <- design1
design2[design1 == 1] <- 2
design2[design1 == 2] <- 3
design2[design1 == 3] <- 1
  
design3 <- design2
design3[design2 == 1] <- 2
design3[design2 == 2] <- 3
design3[design2 == 3] <- 1
# combine the designs to get choice sets

fulldesign <- cbind(design1, design2, design3)


fulldesign_long <- data.frame() # create design in long format
for (i in 1: nrow(design1)){
  tmp <- data.frame(choice = i, 
                    stimulus = 1:3, 
                    rbind(design1[i, ], design2[i,], design3[i ,] ) # repeat 0 for no choice and 5 times for 5 variables
  )
  
  fulldesign_long <- rbind(fulldesign_long, tmp)
  
}

3.2. Step 2: Generate choice cards

# define factors for creating choice cards


## Design 1 (first option in choice sets)

design1$length <- factor(design1$length,
                           levels = c(1,2,3),
                         labels = c("1년",
                                    "2년",
                                    "3년"))



design1$wage <- factor(design1$wage,
                        levels = c(1,2,3),
                        labels = c("70% 급여",
                                   "60% 급여",
                                   "60% 급여"))



design1$work <- factor(design1$work,
                         levels = c(1,2,3),
                         labels = c("근로시간 단축",
                                    "편한 보직으로 전환",
                                    "계약직으로 재고용"))
## Design 2 (second option in choice sets)

design2$length <- factor(design2$length,
                           levels = c(1,2,3),
                         labels = c("1년",
                                    "2년",
                                    "3년"))



design2$wage <- factor(design2$wage,
                        levels = c(1,2,3),
                        labels = c("70% 급여",
                                   "60% 급여",
                                   "60% 급여"))



design2$work <- factor(design2$work,
                         levels = c(1,2,3),
                         labels = c("근로시간 단축",
                                    "편한 보직으로 전환",
                                    "계약직으로 재고용"))
## Design 3 (third option in choice sets)
design3$length <- factor(design3$length,
                           levels = c(1,2,3),
                         labels = c("1년",
                                    "2년",
                                    "3년"))



design3$wage <- factor(design3$wage,
                        levels = c(1,2,3),
                        labels = c("70% 급여",
                                   "60% 급여",
                                   "60% 급여"))



design3$work <- factor(design3$work,
                         levels = c(1,2,3),
                         labels = c("근로시간 단축",
                                    "편한 보직으로 전환",
                                    "계약직으로 재고용"))

3.3. Create choice set tables that are the choice cards

# example for first card

attr <- c("연장기간", 
          "급여수준", 
          "근로방식")


choice_tab <- data.frame(cbind(attr,
                               t(design1[4, ]),
                               t(design2[4, ]),
                               t(design3[4, ]))
                               ) 

colnames(choice_tab) <- c(" ",
                          "A 유형", 
                          "B 유형",
                          "C 유형")
# grid.table(choice_tab, theme = mytheme)

g <- tableGrob(choice_tab,
               cols = colnames(choice_tab), 
               rows = NULL,
               theme = ttheme_default(base_size = 12, 
                                      core = list(fg_params = list(hjust=c(rep(0,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)), 
                                                                   x=c(rep(0.1,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)), 
                                                                   fontsize=9),
                                                  bg_params = list(fill = c("grey80", "grey80", "grey80", "grey80",   # first column
                                                                            "grey95", "grey90", "grey95", "grey90",   # second column
                                                                            "grey95", "grey90", "grey95", "grey90") )   # third column
                                      ), # end core params
                                      colhead = list(bg_params=list(fill = "lightskyblue1"),
                                                     fg_params=list(fontsize = 10))
               ) # end theme
)# end tableGrob



g$widths <- unit(c(0.10, 0.15, 0.15, 0.15), "npc")

grid.draw(g)

# Test image

png(paste0("Kim_test.png"), width=10, height=3,units="cm", res=400)

grid.draw(g)

# dev.off()

3.4. Loop for creating all choice set cards

#### Step 4: Loop for creating all choice set cards (this will write image files) ####

# setwd("E:/Conjoint4")
for (i in 1:nrow(design1)) {
  
  print(paste("Kim' Conjoint: ", i))
  
  
  
  attr <- c("연장기간", 
          "급여수준", 
          "근로방식")
  
  choice_tab <- data.frame(cbind(attr,
                               t(design1[i, ]),
                               t(design2[i, ]),
                               t(design3[i, ]))
                               ) 

  colnames(choice_tab) <- c(" ",
                          "A 유형", 
                          "B 유형",
                          "C 유형")

  
  png(paste0("Kim's_Conjoint", i, ".png"), width=10, height=3,units="cm", res=400)
  
  g <- tableGrob(choice_tab,
               cols = colnames(choice_tab), 
               rows = NULL,
               theme = ttheme_default(base_size = 12, 
                                      core = list(fg_params = list(hjust=c(rep(0,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)), 
                                                                   x=c(rep(0.1,4), rep(0.5,4), rep(0.5,4), rep(0.5,4)), 
                                                                   fontsize=9),
                                                  bg_params = list(fill = c("grey80", "grey80", "grey80", "grey80",   # first column
                                                                            "grey95", "grey90", "grey95", "grey90",   # second column
                                                                            "grey95", "grey90", "grey95", "grey90") )   # third column
                                      ), # end core params
                                      colhead = list(bg_params=list(fill = "lightskyblue1"),
                                                     fg_params=list(fontsize = 10))
               ) # end theme
  )# end tableGrob
  
  
  
  g$widths <- unit(c(0.25, 0.25, 0.25, 0.25), "npc")
  
  grid.draw(g)
  
  
  dev.off()
  
}
## [1] "Kim' Conjoint:  1"
## [1] "Kim' Conjoint:  2"
## [1] "Kim' Conjoint:  3"
## [1] "Kim' Conjoint:  4"
## [1] "Kim' Conjoint:  5"
## [1] "Kim' Conjoint:  6"
## [1] "Kim' Conjoint:  7"
## [1] "Kim' Conjoint:  8"

4. 분석

library(conjoint)
library(skpr)
library(AlgDesign)
library(mlogit)
library(texreg)
library(dotwhisker)
library(ggplot2)
library(jtools)
library(grid)
library(gridExtra)
#### Step 1: Create orthogonal design and full choice set matrix #####

experiment<-expand.grid(length=  c("A",
                                   "B",
                                       "C"),
                        wage =  c("A",
                                  "B",
                                              "C"),
                        work =  c("A",
                                  "B",
                                  "C"))
# create orthogonal design
design1=caFactorialDesign(data=experiment, 
                          type = "orthogonal")
print(design1)
##    length wage work
## 2       B    A    A
## 6       C    B    A
## 7       A    C    A
## 12      C    A    B
## 13      A    B    B
## 17      B    C    B
## 19      A    A    C
## 27      C    C    C
print(cor(caEncodedDesign(design1)))
##        length wage work
## length      1    0    0
## wage        0    1    0
## work        0    0    1
# produce alternatives via shifting

design1 <- data.frame(lapply(design1, 
                             function(x) 
                               as.numeric(x)))

design2 <- design1
design2[design1 == 1] <- 2
design2[design1 == 2] <- 3
design2[design1 == 3] <- 1

design3 <- design2
design3[design2 == 1] <- 2
design3[design2 == 2] <- 3
design3[design2 == 3] <- 1



# combine the designs to get choice sets

fulldesign <- cbind(design1, design2, design3)


fulldesign_long <- data.frame()
for (i in 1: nrow(design1)){
  tmp <- data.frame(choice = i, 
                    stimulus = 1:4, 
                    rbind(design1[i, ], design2[i,], design3[i ,], rep(0, 5) ) # repeat 0 for no choice and 5 times for 5 variables
  )
  
  fulldesign_long <- rbind(fulldesign_long, tmp)
  
}
library(readxl)
df=read_excel("df.xlsx")

크론바흐알파

효과성

psych::alpha(df[c("effect1", 
                       "effect2",
                       "effect3", 
                       "effect4",
                       "effect5",
                        "need1") ],
             check.keys=TRUE)
## 
## Reliability analysis   
## Call: psych::alpha(x = df[c("effect1", "effect2", "effect3", "effect4", 
##     "effect5", "need1")], check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.96      0.96    0.96      0.81  26 0.003  2.3 0.99      0.8
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.96  0.96  0.97
## Duhachek  0.96  0.96  0.97
## 
##  Reliability if an item is dropped:
##         raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## effect1      0.95      0.95    0.95      0.81  21   0.0036 0.0014  0.80
## effect2      0.95      0.95    0.95      0.81  21   0.0036 0.0014  0.81
## effect3      0.95      0.95    0.95      0.80  20   0.0038 0.0011  0.80
## effect4      0.95      0.95    0.95      0.80  20   0.0038 0.0012  0.79
## effect5      0.96      0.96    0.96      0.82  23   0.0033 0.0015  0.81
## need1        0.96      0.96    0.95      0.82  23   0.0034 0.0018  0.81
## 
##  Item statistics 
##           n raw.r std.r r.cor r.drop mean  sd
## effect1 400  0.92  0.92  0.91   0.89  2.3 1.1
## effect2 400  0.92  0.92  0.90   0.88  2.3 1.0
## effect3 400  0.93  0.93  0.92   0.90  2.3 1.1
## effect4 400  0.93  0.93  0.92   0.90  2.3 1.1
## effect5 400  0.89  0.89  0.86   0.85  2.2 1.1
## need1   400  0.90  0.90  0.87   0.86  2.3 1.1
## 
## Non missing response frequency for each item
##            1    2    3    4    5 miss
## effect1 0.29 0.27 0.30 0.12 0.02    0
## effect2 0.28 0.32 0.27 0.12 0.01    0
## effect3 0.28 0.28 0.30 0.11 0.02    0
## effect4 0.30 0.26 0.30 0.12 0.02    0
## effect5 0.31 0.28 0.28 0.11 0.02    0
## need1   0.30 0.24 0.29 0.14 0.03    0
# generate mean of all items 
df$effect <- apply(df[c("effect1", 
                       "effect2",
                       "effect3", 
                       "effect4",
                       "effect5",
                       "need1") ], 1, mean, na.rm=TRUE)

hist(df$effect)

필요성

psych::alpha(df[c("need2",
                       "need3", 
                       "need4",
                       "need5") ],
             check.keys=TRUE)
## 
## Reliability analysis   
## Call: psych::alpha(x = df[c("need2", "need3", "need4", "need5")], check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean  sd median_r
##       0.95      0.95    0.94      0.84  20 0.0039  2.6 1.2     0.83
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.94  0.95  0.96
## Duhachek  0.95  0.95  0.96
## 
##  Reliability if an item is dropped:
##       raw_alpha std.alpha G6(smc) average_r S/N alpha se   var.r med.r
## need2      0.93      0.93    0.90      0.82  14   0.0059 0.00069  0.82
## need3      0.93      0.93    0.90      0.82  14   0.0060 0.00090  0.81
## need4      0.94      0.94    0.92      0.84  16   0.0053 0.00186  0.82
## need5      0.95      0.95    0.93      0.86  19   0.0044 0.00049  0.85
## 
##  Item statistics 
##         n raw.r std.r r.cor r.drop mean  sd
## need2 400  0.95  0.95  0.93   0.91  2.6 1.3
## need3 400  0.95  0.95  0.93   0.91  2.6 1.3
## need4 400  0.93  0.93  0.90   0.88  2.6 1.3
## need5 400  0.91  0.91  0.87   0.85  2.6 1.2
## 
## Non missing response frequency for each item
##          1    2    3    4    5 miss
## need2 0.29 0.18 0.25 0.23 0.05    0
## need3 0.30 0.17 0.25 0.22 0.06    0
## need4 0.30 0.17 0.26 0.20 0.07    0
## need5 0.28 0.19 0.26 0.22 0.05    0
df$need <- apply(df[c("need2",
                       "need3", 
                       "need4",
                       "need5") ], 1, mean, na.rm=TRUE)

hist(df$need)

K cluster analysis

library(NbClust)
# criteria <- NbClust(data = df[  , c("effect", "need") ], diss = NULL, distance = "euclidean", min.nc = 2, max.nc = 10, method = "kmeans", index = "all", alphaBeale = 0.1)


# K-Means Cluster Analysis
kclust <- kmeans( df[  ,  c("effect", "need") ] , 
                 3, 
                 nstart=10 ) # 3 cluster solution

df$kgroup <- kclust$cluster


# get cluster means (shown in Annex A7)
aggregate(df[, c("effect", "need")], 
          by=list(df$kgroup),
          FUN=mean)
table(df$kgroup)
## 
##   1   2   3 
##  75 163 162
table(df$kgroup, df$effect)
##    
##      1 1.16666666666667 1.33333333333333 1.5 1.66666666666667 1.83333333333333
##   1  0                0                0   0                0                0
##   2  8                0                1   2                4                2
##   3 71               10               14  10                6               15
##    
##      2 2.16666666666667 2.33333333333333 2.5 2.66666666666667 2.83333333333333
##   1  1                0                0   0                3                4
##   2 14                9               15  21               13               17
##   3 24                5                4   2                1                0
##    
##      3 3.16666666666667 3.33333333333333 3.5 3.66666666666667 3.83333333333333
##   1  7                3                7   9                6                6
##   2 46                6                5   0                0                0
##   3  0                0                0   0                0                0
##    
##      4 4.16666666666667 4.33333333333333 4.66666666666667 4.83333333333333  5
##   1 19                1                2                1                1  5
##   2  0                0                0                0                0  0
##   3  0                0                0                0                0  0
table(df$kgroup, df$need)
##    
##      1 1.25 1.5 1.75  2 2.25 2.5 2.75  3 3.25 3.5 3.75  4 4.25 4.5 4.75  5
##   1  0    0   0    0  0    0   0    0  0    2   5   14 30    5   2    8  9
##   2  0    0   0    0  8    8  15   16 58   22   9   11 13    1   1    0  1
##   3 88   13  14   16 24    6   1    0  0    0   0    0  0    0   0    0  0
# kgroup3: effect low, need low
# kgroup2: effect mid, need mid
# kgroup1: effect high, need high
df$set1 <- 4
df$set1[df$v21 == "A유형"] <- 1
df$set1[df$v21 == "B유형"] <- 2
df$set1[df$v21 == "C유형"] <- 3

df$set2 <- 4
df$set2[df$v22 == "A유형"] <- 1
df$set2[df$v22 == "B유형"] <- 2
df$set2[df$v22 == "C유형"] <- 3

df$set3 <- 4
df$set3[df$v23 == "A유형"] <- 1
df$set3[df$v23 == "B유형"] <- 2
df$set3[df$v23 == "C유형"] <- 3

df$set4 <- 4
df$set4[df$v24 == "A유형"] <- 1
df$set4[df$v24 == "B유형"] <- 2
df$set4[df$v24 == "C유형"] <- 3

df$set5 <- 4
df$set5[df$v25 == "A유형"] <- 1
df$set5[df$v25 == "B유형"] <- 2
df$set5[df$v25 == "C유형"] <- 3

df$set6 <- 4
df$set6[df$v26 == "A유형"] <- 1
df$set6[df$v26 == "B유형"] <- 2
df$set6[df$v26 == "C유형"] <- 3

df$set7 <- 4
df$set7[df$v27 == "A유형"] <- 1
df$set7[df$v27 == "B유형"] <- 2
df$set7[df$v27 == "C유형"] <- 3

df$set8 <- 4
df$set8[df$v28 == "A유형"] <- 1
df$set8[df$v28 == "B유형"] <- 2
df$set8[df$v28 == "C유형"] <- 3
#### Transform response data into required format ####

df$respondentID <- 1:nrow(df)

min_frame <- df

r <- 8 # set number of choice sets
k <- 4 # set number of stimuli per choice set


minl_frame <- data.frame() # define empty dataframe

for (i in 1:nrow(min_frame)){ # build choice data frames for every respondent and append them.
  
  tmp <- data.frame(respondentID = min_frame$respondentID[i],
                     choice = rep(1:r, each=k), 
                     stimulus = rep(1:k, r)
                     )
  
  tmp_choice <- rep(0, r*k)
  # get indices: which 
  get_indices <- c(min_frame$set1[i],
    min_frame$set2[i],
    min_frame$set3[i],
    min_frame$set4[i],
    min_frame$set5[i],
    min_frame$set6[i],
    min_frame$set7[i],
    min_frame$set8[i]
  ) + rep(k, r)*(0:(r-1)) # take into account that indices add up in steps of 4
  
  tmp_choice[get_indices] <- 1
  
  tmp <- data.frame(cbind(tmp, tmp_choice))
  minl_frame <- rbind(minl_frame, tmp)
  
}
#### generate dummy variables ####

# length연장기간
fulldesign_long$length1 <- 0
fulldesign_long$length1[fulldesign_long$length == 1] <- 1

fulldesign_long$length2 <- 0
fulldesign_long$length2[fulldesign_long$length == 2] <- 1

fulldesign_long$length3 <- 0
fulldesign_long$length3[fulldesign_long$length == 3] <- 1


# wage 급여수준

fulldesign_long$wage1 <- 0
fulldesign_long$wage1[fulldesign_long$wage == 1] <- 1

fulldesign_long$wage2 <- 0
fulldesign_long$wage2[fulldesign_long$wage == 2] <- 1

fulldesign_long$wage3 <- 0
fulldesign_long$wage3[fulldesign_long$wage == 3] <- 1

# work 근로방법

fulldesign_long$work1 <- 0
fulldesign_long$work1[fulldesign_long$work == 1] <- 1

fulldesign_long$work2 <- 0
fulldesign_long$work2[fulldesign_long$work == 2] <- 1

fulldesign_long$work3 <- 0
fulldesign_long$work3[fulldesign_long$work == 3] <- 1



# none
fulldesign_long$none <- 0
fulldesign_long$none[fulldesign_long$stimulus == 4] <- 1



#### generate effect-coded variables ####

# length연장기간

fulldesign_long$lengthE2 <- 0
fulldesign_long$lengthE2[fulldesign_long$length == 1] <- -1
fulldesign_long$lengthE2[fulldesign_long$length == 2] <- 1

fulldesign_long$lengthE3 <- 0
fulldesign_long$lengthE3[fulldesign_long$length == 1] <- -1
fulldesign_long$lengthE3[fulldesign_long$length == 3] <- 1



# wage 급여수준

fulldesign_long$wageE2 <- 0
fulldesign_long$wageE2[fulldesign_long$wage == 1] <- -1
fulldesign_long$wageE2[fulldesign_long$wage == 2] <- 1

fulldesign_long$wageE3 <- 0
fulldesign_long$wageE3[fulldesign_long$wage == 1] <- -1
fulldesign_long$wageE3[fulldesign_long$wage == 3] <- 1



# work 근로방법

fulldesign_long$workE2 <- 0
fulldesign_long$workE2[fulldesign_long$work == 1] <- -1
fulldesign_long$workE2[fulldesign_long$work == 2] <- 1

fulldesign_long$workE3 <- 0
fulldesign_long$workE3[fulldesign_long$work == 1] <- -1
fulldesign_long$workE3[fulldesign_long$work == 3] <- 1


# none
fulldesign_long$noneE <- -1
fulldesign_long$noneE[fulldesign_long$stimulus == 4] <- 1




# match the reshaped data.frame with the choice designs

merged_data <- merge(minl_frame, fulldesign_long, by = c("choice", "stimulus"))
merged_data <- merged_data[order(merged_data$respondentID), ]
# prepare individual level variables to be merged with the choice data frame

tmp <- df[c("respondentID", 
              "peak",
              "gender", 
              "affiliation",
              "age",
              "income",
              "experiment",
              "kgroup")] # generaged clustervariable 

merged_data <- merge(merged_data, tmp, by = "respondentID") # merge individual-level variables with the choice data frame
# convert data for mlogit

cbc <- mlogit.data(merged_data, choice="tmp_choice", shape="long", alt.var="stimulus", id.var = "respondentID")

Step 3: Analysis using multinomial logit models

#### partworth model with dummy coding ####

ml1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc)
summary(ml1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc, method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29031 0.25500 0.24938 0.20531 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 4.64E-08 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.376320   0.058267   6.4586 1.057e-10 ***
## length3  0.663826   0.055910  11.8731 < 2.2e-16 ***
## wage2   -0.593390   0.046870 -12.6603 < 2.2e-16 ***
## wage3   -1.236786   0.059698 -20.7174 < 2.2e-16 ***
## work2   -0.188861   0.051181  -3.6901 0.0002242 ***
## work3   -0.457528   0.053572  -8.5404 < 2.2e-16 ***
## none    -0.548724   0.066735  -8.2224 2.220e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -3991
# wordreg(  ml1,   file = "model 1.doc")
round(exp(coef(ml1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.457   1.942   0.552   0.290   0.828   0.633   0.578
  • 이것은 바로위 결과값을 지수변환한 값임
  • length2 0.3763(계수값, Estimate)을 지수변환 값으로 1.457로 나타남
  • 이렇게 변환하면, 해석하기 용이하게 됨. 소위 오즈비(odds ration)를 나타내는데, 이것은 우리말로 승산비라고도 함
  • 여기서, 1.457은 사람들이 기본옵션(1년 연장)에 비해, 2년 연장을 1.457배 더 선호한다고 해석할 수 있음. 다르게 표현하면 1년연장에 비해 2년연장을 선택할 가능성이 45.7%(1.457-1=0.457) 높아진다고 해석할 수 있음
  • 만약 임금(wage2) 오즈비 0.552는 다음과 같이 해석할 수 있음. 80% 임금 수준에 비해 70% 임금수준을 0.552배 선호한다. 즉, 80%임금에 비해 70%임금을 선택할 확률은 44.8%(0.552-1=-0.448, 44.8%낮아짐) 낮아진다고 해석할 수 있음

Point predictions for main model

Choose A옵션(1년 연장+급여 60%감축+계약직) versus B옵션(1년연장+급여80%갑축 + 시간단축) ; 고용연장 1년으로 고정

tmp_ml <- ml1
p_pred_a <- exp(  tmp_ml$coefficients[4] + # 60% 급여
                      tmp_ml$coefficients[6] )  /  # 계약직 
                                                        ( exp( tmp_ml$coefficients[4] + # 60%급여
                                                            tmp_ml$coefficients[6] ) + # 계약직
                                                           exp(  tmp_ml$coefficients[4]*0 + # 80%급여
                                                                   tmp_ml$coefficients[6]*0 ) # 시간단축
                                                        )

p_pred_a # get predicted probability of choosing option A versus option B
##     wage3 
## 0.1552093
  • B옵션(급여80%갑축 + 시간단축) 보다 A옵션(급여 60%감축+계약직)을 선택할 확률은 15.5%

calculate confidence intervals

library(MASS)
## 
## 다음의 패키지를 부착합니다: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## The following object is masked from 'package:dfidx':
## 
##     select
m <- ml1
est_betas <- m$coefficients

sim_betas <- mvrnorm(1000, m$coefficients, vcov(m))

sim_ci <- data.frame()

  sim_preds <- apply(sim_betas, 1, function(x) {
    m$coefficients <- x
    exp(  m$coefficients[4] + # explainable AI
            m$coefficients[6] )  /  # costs2
      ( exp( m$coefficients[4] + # explainable AI
               m$coefficients[6] ) + # costs 2
          exp(  m$coefficients[4]*0 + # NO explainable AI
                  m$coefficients[6]*0 ) # NO costs
      )
    
  })
  
  sim_ci_a <- rbind(sim_ci, quantile(sim_preds, c(.025, .975)) )


colnames(sim_ci_a) <- c("lo", "hi")

sim_ci_a
  • (B)보다 (A)를 선택할 확률은 15.5%이며 13.7%~17.6% 사이의 95% 신뢰 구간(몬테카를로 시뮬레이션을 사용하여 추정)입니다.
# prepare results for plotting

modelframe <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  ml1$coefficients[1], ml1$coefficients[2],
                ml1$coefficients[3], ml1$coefficients[4],
                ml1$coefficients[5], ml1$coefficients[6],
                ml1$coefficients[7]),
  std.error = c( coef(summary(ml1))[1, "Std. Error"], coef(summary(ml1))[2, "Std. Error"], 
                  coef(summary(ml1))[3, "Std. Error"], coef(summary(ml1))[4, "Std. Error"], 
                  coef(summary(ml1))[5, "Std. Error"], coef(summary(ml1))[6, "Std. Error"], 
                  coef(summary(ml1))[7, "Std. Error"]
  )
)



# https://cran.r-project.org/web/packages/dotwhisker/vignettes/dotwhisker-vignette.html

dwplot(modelframe)

# define additional labels in plot (bracketing the feature levels)
three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 



p1 <- {dwplot(modelframe, 
        vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1)) + 
    theme_bw() + scale_colour_grey() + theme(legend.position = "none") + xlab("Estimated partworth utilities") + ylab("")  } %>%   add_brackets(three_brackets) 

p1

png("Fig_1.png", width=18, height=18,units="cm", res=400)

p1



pdf(paste0("Fig_1.pdf"), width=7, height=7)

p1

dev.off()
## png 
##   2
#### Model with effects coding (shown in Annex A5 and A6) ####

ml1e <- mlogit(tmp_choice ~ lengthE2 + lengthE3 +
                wageE2 + wageE3 +
                workE2 + workE3 +
                none | 0, cbc)


summary(ml1e)
## 
## Call:
## mlogit(formula = tmp_choice ~ lengthE2 + lengthE3 + wageE2 + 
##     wageE3 + workE2 + workE3 + none | 0, data = cbc, method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29031 0.25500 0.24938 0.20531 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 4.64E-08 
## gradient close to zero 
## 
## Coefficients :
##           Estimate Std. Error  z-value  Pr(>|z|)    
## lengthE2  0.029605   0.030611   0.9671    0.3335    
## lengthE3  0.317111   0.029110  10.8937 < 2.2e-16 ***
## wageE2    0.016669   0.031886   0.5228    0.6011    
## wageE3   -0.626727   0.038372 -16.3330 < 2.2e-16 ***
## workE2    0.026602   0.031311   0.8496    0.3955    
## workE3   -0.242065   0.032617  -7.4215 1.157e-13 ***
## none     -0.069918   0.045667  -1.5310    0.1258    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -3991
# wordreg(  ml1e,   file = "model 1E.doc")


covMatrix <- vcov(ml1e)
sqrt(diag(covMatrix))
##   lengthE2   lengthE3     wageE2     wageE3     workE2     workE3       none 
## 0.03061134 0.02910965 0.03188605 0.03837193 0.03131058 0.03261671 0.04566664
modelframe <- data.frame(term = c(
  '1yr',
  '2yr',
  '3yr',
  '80%',
  '70%',
  '60%',
  'reduction',
  'transfer',
  'part-time',
  'No-choice option'),
  estimate = c( -(ml1e$coefficients[1] + ml1e$coefficients[2]), ml1e$coefficients[1], ml1e$coefficients[2],
                -(ml1e$coefficients[3] + ml1e$coefficients[4]), ml1e$coefficients[3], ml1e$coefficients[4],
                -(ml1e$coefficients[5] + ml1e$coefficients[6]), ml1e$coefficients[5], ml1e$coefficients[6],
                  ml1e$coefficients[7] ),
  std.error = c( sqrt(sum(covMatrix[1:2, 1:2])), coef(summary(ml1e))[1, "Std. Error"], coef(summary(ml1e))[2, "Std. Error"], 
                 sqrt(sum(covMatrix[3:4, 3:4])), coef(summary(ml1e))[3, "Std. Error"], coef(summary(ml1e))[4, "Std. Error"], 
                 sqrt(sum(covMatrix[5:6, 5:6])), coef(summary(ml1e))[5, "Std. Error"], coef(summary(ml1e))[6, "Std. Error"], 
                 coef(summary(ml1e))[7, "Std. Error"]
                 )
                        )


modelframe # get estimates (for Annex A5)
dwplot(modelframe)

# define additional labels in plot (bracketing the feature levels)
three_brackets <- list(c("length", 
                        "1yr", 
                        "3yr"), 
                      c("wage",
                        '80%',
                        '60%'),
                      c("work", 
                        'reduction',
                        'part-time')
) 



p1e <- {dwplot(modelframe, 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1)) + 
    theme_bw() + scale_colour_grey() + theme(legend.position = "none") + xlab("Estimated partworth utilities") + ylab("") } %>%   add_brackets(three_brackets)


p1e

png("Fig_1e.png", width=18, height=18,units="cm", res=400)

p1e



pdf(paste0("Fig_1e.pdf"), width=7, height=7)

p1e

dev.off()
## png 
##   2

Estimated partworths by clusters

ml1g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$kgroup == 1, ])


summary(ml1g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$kgroup == 1, ], 
##     method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.31333 0.31167 0.31500 0.06000 
## 
## nr method
## 5 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.000115 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error z-value  Pr(>|z|)    
## length2  0.437690   0.129445  3.3813 0.0007215 ***
## length3  0.952692   0.118585  8.0338 8.882e-16 ***
## wage2   -0.406232   0.101233 -4.0129 5.999e-05 ***
## wage3   -0.921885   0.120271 -7.6651 1.776e-14 ***
## work2   -0.028530   0.112559 -0.2535 0.7999066    
## work3   -0.067461   0.113677 -0.5934 0.5528847    
## none    -1.511329   0.207903 -7.2694 3.610e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -676.18
ml1g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$kgroup == 2, ])
summary(ml1g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$kgroup == 2, ], 
##     method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.33052 0.28374 0.28298 0.10276 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 4.67E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.444100   0.085249   5.2094 1.894e-07 ***
## length3  0.627066   0.084044   7.4612 8.571e-14 ***
## wage2   -0.584567   0.069957  -8.3561 < 2.2e-16 ***
## wage3   -1.169597   0.086657 -13.4969 < 2.2e-16 ***
## work2   -0.244641   0.075023  -3.2609  0.001111 ** 
## work3   -0.601871   0.080559  -7.4711 7.949e-14 ***
## none    -1.397780   0.117535 -11.8924 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1525.2
ml1g3 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$kgroup == 3, ])
summary(ml1g3)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$kgroup == 3, ], 
##     method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.23920 0.19985 0.18519 0.37577 
## 
## nr method
## 5 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.00105 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.235455   0.102764   2.2912   0.02195 *  
## length3  0.501524   0.098107   5.1120 3.187e-07 ***
## wage2   -0.745044   0.082524  -9.0282 < 2.2e-16 ***
## wage3   -1.634117   0.117942 -13.8553 < 2.2e-16 ***
## work2   -0.202816   0.090820  -2.2332   0.02554 *  
## work3   -0.537365   0.095018  -5.6554 1.555e-08 ***
## none     0.057760   0.103571   0.5577   0.57706    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1563.1
# Prepare results for plotting group 1
tmp <- ml1g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##   length2   length3     wage2     wage3     work2     work3      none 
## 0.1294455 0.1185849 0.1012326 0.1202706 0.1125585 0.1136765 0.2079033
modelframe2a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe2a$term <- as.character(modelframe2a$term)
dwplot(modelframe2a)

# Prepare results for plotting group 2
tmp <- ml1g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.08524922 0.08404411 0.06995710 0.08665698 0.07502342 0.08055944 0.11753543
modelframe2b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe2b$term <- as.character(modelframe2b$term)
dwplot(modelframe2b)

# Prepare results for plotting group 3
tmp <- ml1g3

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.10276389 0.09810679 0.08252447 0.11794157 0.09082015 0.09501760 0.10357076
modelframe2c <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)


modelframe2c$term <- as.character(modelframe2c$term)
dwplot(modelframe2c)

# combine results for all three groups into one data frame for plotting

modelframe_allg <- rbind(modelframe2a, modelframe2b, modelframe2c) # dataframe with outputs for all groups

modelframe_allg$model <- rep(c("Group 1", "Group 2", "Group 3"), each = 7)

modelframe_allg <- modelframe_allg[order(modelframe_allg$model, decreasing = T), ]


modelframe_allg$model <- factor (modelframe_allg$model, 
                              levels = c("Group 1","Group 2","Group 3"), 
                              labels = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p2 <- {dwplot( modelframe_allg , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow"),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Group", breaks=c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow") ) + 
        scale_shape_discrete(name = "Group", breaks = c("Group 1: Effect&Need\nHigh", "Group 2:Effect&Need\nMid","Group 3: Effect&Need\nLow") )  +
    guides(
      shape = guide_legend("Group"), 
      colour = guide_legend("Group"),
      size = guide_legend("Group")
    ) }   %>% 
  add_brackets(three_brackets)

p2

png("Fig_2.png", width=18, height=18,units="cm", res=400)

p2



pdf(paste0("Fig_2.pdf"), width=7, height=7)

p2

dev.off()
## png 
##   2
round(exp(coef(ml1g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.549   2.593   0.666   0.398   0.972   0.935   0.221
round(exp(coef(ml1g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.559   1.872   0.557   0.310   0.783   0.548   0.247
round(exp(coef(ml1g3)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.265   1.651   0.475   0.195   0.816   0.584   1.059

Estimated partworths by 성별

table(df$gender)
## 
## 남성 여성 
##  308   92
ml2g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$gender == "남성", ])


summary(ml2g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$gender == "남성", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.28693 0.25122 0.24269 0.21916 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 1.14E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.347450   0.066948   5.1898 2.105e-07 ***
## length3  0.662683   0.063861  10.3770 < 2.2e-16 ***
## wage2   -0.588388   0.053801 -10.9364 < 2.2e-16 ***
## wage3   -1.227969   0.068507 -17.9248 < 2.2e-16 ***
## work2   -0.179668   0.058811  -3.0550  0.002251 ** 
## work3   -0.439356   0.061377  -7.1583 8.167e-13 ***
## none    -0.465283   0.075568  -6.1572 7.405e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -3090.6
ml2g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$gender == "여성", ])
summary(ml2g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$gender == "여성", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.30163 0.26766 0.27174 0.15897 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.00425 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.465919   0.118515   3.9313 8.448e-05 ***
## length3  0.667580   0.115778   5.7660 8.116e-09 ***
## wage2   -0.610465   0.095616  -6.3846 1.719e-10 ***
## wage3   -1.266244   0.121800 -10.3961 < 2.2e-16 ***
## work2   -0.217646   0.104077  -2.0912   0.03651 *  
## work3   -0.515823   0.109930  -4.6923 2.702e-06 ***
## none    -0.863150   0.144104  -5.9898 2.102e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -892.94
# Prepare results for plotting group 1
tmp <- ml2g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.06694830 0.06386074 0.05380072 0.06850672 0.05881146 0.06137699 0.07556754
modelframe3a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe3a$term <- as.character(modelframe3a$term)
dwplot(modelframe3a)

# Prepare results for plotting group 2
tmp <- ml2g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.11851490 0.11577823 0.09561584 0.12179964 0.10407750 0.10992980 0.14410440
modelframe3b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe3b$term <- as.character(modelframe3b$term)
dwplot(modelframe3b)

# combine results for all three groups into one data frame for plotting

modelframe_allg2 <- rbind(modelframe3a, modelframe3b) # dataframe with outputs for all groups

modelframe_allg2$model <- rep(c("male", "female"), each = 7)

modelframe_allg2 <- modelframe_allg2[order(modelframe_allg2$model, decreasing = T), ]


modelframe_allg2$model <- factor (modelframe_allg2$model, 
                              levels = c("male", "female"), 
                              labels = c("male", "female"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p3 <- {dwplot( modelframe_allg2 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "male", "female"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Gender", breaks=c("male", "female") ) + 
        scale_shape_discrete(name = "Gender", breaks = c("male", "female") )  +
    guides(
      shape = guide_legend("Gender"), 
      colour = guide_legend("Gender"),
      size = guide_legend("Gender")
    ) }   %>% 
  add_brackets(three_brackets)


p3

png("Fig_3.png", width=18, height=18,units="cm", res=400)

p3



pdf(paste0("Fig_3.pdf"), width=7, height=7)

p3

dev.off()
## png 
##   2
round(exp(coef(ml2g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.415   1.940   0.555   0.293   0.836   0.644   0.628
round(exp(coef(ml2g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.593   1.950   0.543   0.282   0.804   0.597   0.422

Estimated partworths by 사기업/공기업

table(df$affiliation)
## 
## 공기업 사기업 
##    200    200
ml4g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$affiliation == "사기업", ])


summary(ml4g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$affiliation == "사기업", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29812 0.27563 0.25562 0.17062 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 2.46E-08 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.479732   0.081593   5.8796 4.114e-09 ***
## length3  0.687576   0.079725   8.6244 < 2.2e-16 ***
## wage2   -0.689575   0.065888 -10.4658 < 2.2e-16 ***
## wage3   -1.320166   0.083529 -15.8050 < 2.2e-16 ***
## work2   -0.250940   0.072219  -3.4747 0.0005114 ***
## work3   -0.474703   0.074732  -6.3521 2.124e-10 ***
## none    -0.796195   0.096790  -8.2260 2.220e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1937.7
ml4g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$affiliation == "공기업", ])
summary(ml4g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$affiliation == "공기업", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.28250 0.23438 0.24313 0.24000 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 3.83E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.268009   0.083624   3.2049 0.0013509 ** 
## length3  0.643319   0.078607   8.1839 2.220e-16 ***
## wage2   -0.492007   0.066977  -7.3459 2.043e-13 ***
## wage3   -1.148013   0.085567 -13.4166 < 2.2e-16 ***
## work2   -0.127072   0.072859  -1.7441 0.0811458 .  
## work3   -0.442535   0.077163  -5.7351 9.748e-09 ***
## none    -0.322347   0.093111  -3.4620 0.0005363 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -2035.2
# Prepare results for plotting group 1
tmp <- ml4g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.08159318 0.07972498 0.06588812 0.08352859 0.07221886 0.07473156 0.09678964
modelframe4a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe4a$term <- as.character(modelframe4a$term)
dwplot(modelframe4a)

# Prepare results for plotting group 2
tmp <- ml4g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.08362370 0.07860744 0.06697704 0.08556688 0.07285906 0.07716329 0.09311143
modelframe4b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe4b$term <- as.character(modelframe4b$term)
dwplot(modelframe4b)

# combine results for all three groups into one data frame for plotting

modelframe_allg4 <- rbind(modelframe4a, modelframe4b) # dataframe with outputs for all groups

modelframe_allg4$model <- rep(c("private", "public"), each = 7)

modelframe_allg4 <- modelframe_allg4[order(modelframe_allg4$model, decreasing = T), ]


modelframe_allg4$model <- factor (modelframe_allg4$model, 
                              levels = c("private", "public"), 
                              labels = c("private", "public"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p4 <- {dwplot( modelframe_allg4 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "private", "public"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Affiliation", breaks=c("private", "public") ) + 
        scale_shape_discrete(name = "Affiliation", breaks = c("private", "public") )  +
    guides(
      shape = guide_legend("Affiliation"), 
      colour = guide_legend("Affiliation"),
      size = guide_legend("Affiliation")
    ) }   %>% 
  add_brackets(three_brackets)

p4

png("Fig_4.png", width=18, height=18,units="cm", res=400)

p4




pdf(paste0("Fig_4.pdf"), width=7, height=7)

p4

dev.off()
## png 
##   2
round(exp(coef(ml4g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.616   1.989   0.502   0.267   0.778   0.622   0.451
round(exp(coef(ml4g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.307   1.903   0.611   0.317   0.881   0.642   0.724

Estimated partworths by 임금피크제 도입여부

table(df$peak)
## 
##   0   1 
##  83 317
ml5g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$peak == 1, ])


summary(ml5g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$peak == 1, ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.28785 0.25237 0.24448 0.21530 
## 
## nr method
## 4 iterations, 0h:0m:1s 
## g'(-H)^-1g = 9.51E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.340775   0.066309   5.1392 2.759e-07 ***
## length3  0.622835   0.063746   9.7705 < 2.2e-16 ***
## wage2   -0.677929   0.053153 -12.7544 < 2.2e-16 ***
## wage3   -1.415175   0.070613 -20.0413 < 2.2e-16 ***
## work2   -0.186414   0.058397  -3.1922  0.001412 ** 
## work3   -0.511131   0.061323  -8.3350 < 2.2e-16 ***
## none    -0.583400   0.074417  -7.8396 4.441e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -3116
ml5g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$peak == 0, ])
summary(ml5g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$peak == 0, ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29970 0.26506 0.26807 0.16717 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.000139 
## successive function values within tolerance limits 
## 
## Coefficients :
##         Estimate Std. Error z-value  Pr(>|z|)    
## length2  0.49670    0.12378  4.0129 5.999e-05 ***
## length3  0.80459    0.11793  6.8225 8.944e-12 ***
## wage2   -0.28297    0.10101 -2.8013  0.005090 ** 
## wage3   -0.68164    0.11562 -5.8955 3.735e-09 ***
## work2   -0.18719    0.10770 -1.7380  0.082209 .  
## work3   -0.29188    0.11168 -2.6136  0.008958 ** 
## none    -0.44964    0.15383 -2.9229  0.003468 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -853.12
# Prepare results for plotting group 1
tmp <- ml5g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.06630884 0.06374637 0.05315261 0.07061296 0.05839707 0.06132309 0.07441721
modelframe5a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe5a$term <- as.character(modelframe5a$term)
dwplot(modelframe5a)

# Prepare results for plotting group 2
tmp <- ml5g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##   length2   length3     wage2     wage3     work2     work3      none 
## 0.1237765 0.1179318 0.1010146 0.1156203 0.1077017 0.1116774 0.1538349
modelframe5b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe5b$term <- as.character(modelframe5b$term)
dwplot(modelframe5b)

# combine results for all three groups into one data frame for plotting

modelframe_allg5 <- rbind(modelframe5a, modelframe5b) # dataframe with outputs for all groups

modelframe_allg5$model <- rep(c("Yes", "No"), each = 7)

modelframe_allg5 <- modelframe_allg5[order(modelframe_allg5$model, decreasing = T), ]


modelframe_allg5$model <- factor (modelframe_allg5$model, 
                              levels = c("Yes", "No"), 
                              labels = c("Yes", "No"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p5 <- {dwplot( modelframe_allg5 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "Yes", "No"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "peak", breaks=c("Yes", "No") ) + 
        scale_shape_discrete(name = "peak", breaks = c("Yes", "No") )  +
    guides(
      shape = guide_legend("peak"), 
      colour = guide_legend("peak"),
      size = guide_legend("peak")
    ) }   %>% 
  add_brackets(three_brackets)

p5

png("Fig_5.png", width=18, height=18,units="cm", res=400)

p5




pdf(paste0("Fig_5.pdf"), width=7, height=7)

p5

dev.off()
## png 
##   2
round(exp(coef(ml5g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.406   1.864   0.508   0.243   0.830   0.600   0.558
round(exp(coef(ml5g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.643   2.236   0.754   0.506   0.829   0.747   0.638

Estimated partworths by 연령대별

table(df$age)
## 
## 장년 중년 청년 
##  151  118  131
ml6g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$age == "청년", ])


summary(ml6g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$age == "청년", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29580 0.28721 0.25477 0.16221 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 5.83E-08 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.468677   0.101733   4.6069 4.087e-06 ***
## length3  0.767030   0.097633   7.8562 3.997e-15 ***
## wage2   -0.548681   0.079298  -6.9192 4.542e-12 ***
## wage3   -1.364954   0.106776 -12.7833 < 2.2e-16 ***
## work2   -0.224265   0.088483  -2.5345   0.01126 *  
## work3   -0.467895   0.092402  -5.0637 4.113e-07 ***
## none    -0.778378   0.121528  -6.4050 1.504e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1259.6
ml6g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$age == "중년", ])
summary(ml6g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$age == "중년", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.30085 0.25106 0.27542 0.17267 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.00165 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.356012   0.103055   3.4546 0.0005511 ***
## length3  0.597730   0.100440   5.9511 2.663e-09 ***
## wage2   -0.504397   0.084680  -5.9565 2.577e-09 ***
## wage3   -1.088724   0.105252 -10.3440 < 2.2e-16 ***
## work2   -0.192344   0.089557  -2.1477 0.0317363 *  
## work3   -0.644974   0.098756  -6.5310 6.534e-11 ***
## none    -0.795075   0.124482  -6.3871 1.691e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1174.7
ml6g3 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$age == "장년", ])
summary(ml6g3)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$age == "장년", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.27732 0.23013 0.22434 0.26821 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 3.39E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.308557   0.098643   3.1280  0.001760 ** 
## length3  0.630213   0.093568   6.7354 1.635e-11 ***
## wage2   -0.722731   0.080645  -8.9619 < 2.2e-16 ***
## wage3   -1.255726   0.099362 -12.6379 < 2.2e-16 ***
## work2   -0.151152   0.088332  -1.7112  0.087047 .  
## work3   -0.290955   0.088995  -3.2693  0.001078 ** 
## none    -0.217729   0.106664  -2.0413  0.041224 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1523.3
# Prepare results for plotting group 1
tmp <- ml6g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.10173323 0.09763328 0.07929824 0.10677608 0.08848310 0.09240248 0.12152756
modelframe6a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe6a$term <- as.character(modelframe6a$term)
dwplot(modelframe6a)

# Prepare results for plotting group 2
tmp <- ml6g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.10305492 0.10043972 0.08467987 0.10525194 0.08955721 0.09875592 0.12448191
modelframe6b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe6b$term <- as.character(modelframe6b$term)
dwplot(modelframe6b)

# Prepare results for plotting group 3
tmp <- ml6g3

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.09864320 0.09356754 0.08064466 0.09936181 0.08833159 0.08899484 0.10666361
modelframe6c <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)


modelframe6c$term <- as.character(modelframe6c$term)
dwplot(modelframe6c)

# combine results for all three groups into one data frame for plotting

modelframe_allg6 <- rbind(modelframe6a, modelframe6b, modelframe6c) # dataframe with outputs for all groups

modelframe_allg6$model <- rep(c("2030s","40s","50s"), each = 7)

modelframe_allg6 <- modelframe_allg6[order(modelframe_allg6$model, decreasing = T), ]


modelframe_allg6$model <- factor (modelframe_allg6$model, 
                              levels = c("2030s","40s","50s"), 
                              labels = c("2030s","40s","50s"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p6 <- {dwplot( modelframe_allg6 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "2030s","40s","50s"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Age", breaks=c("2030s","40s","50s") ) + 
        scale_shape_discrete(name = "Age", breaks = c("2030s","40s","50s") )  +
    guides(
      shape = guide_legend("Age"), 
      colour = guide_legend("Age"),
      size = guide_legend("Age")
    ) }   %>% 
  add_brackets(three_brackets)

p6

png("Fig_6.png", width=18, height=18,units="cm", res=400)

p6



pdf(paste0("Fig_6.pdf"), width=7, height=7)

p6

dev.off()
## png 
##   2
round(exp(coef(ml6g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.598   2.153   0.578   0.255   0.799   0.626   0.459
round(exp(coef(ml6g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.428   1.818   0.604   0.337   0.825   0.525   0.452
round(exp(coef(ml6g3)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.361   1.878   0.485   0.285   0.860   0.748   0.804

Estimated partworths by 소득수준별

table(df$income)
## 
## 평균 이상이다. 평균 이하이다.      평균이다. 
##             95             65            240
ml7g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$income == "평균 이상이다.", ])


summary(ml7g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$income == "평균 이상이다.", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29474 0.27763 0.26184 0.16579 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.000421 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error z-value  Pr(>|z|)    
## length2  0.429433   0.116359  3.6906 0.0002237 ***
## length3  0.748315   0.110188  6.7912 1.112e-11 ***
## wage2   -0.433162   0.094563 -4.5807 4.634e-06 ***
## wage3   -0.880642   0.110766 -7.9505 1.776e-15 ***
## work2    0.042048   0.100871  0.4169 0.6767876    
## work3   -0.250938   0.107125 -2.3425 0.0191560 *  
## none    -0.507194   0.144071 -3.5205 0.0004308 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -963.88
ml7g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$income == "평균이다.", ])
summary(ml7g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$income == "평균이다.", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.29583 0.25312 0.24531 0.20573 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 2.06E-07 
## gradient close to zero 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.312277   0.074905   4.1690 3.060e-05 ***
## length3  0.542547   0.072369   7.4969 6.528e-14 ***
## wage2   -0.687096   0.060644 -11.3300 < 2.2e-16 ***
## wage3   -1.424996   0.080528 -17.6957 < 2.2e-16 ***
## work2   -0.233827   0.066922  -3.4940 0.0004758 ***
## work3   -0.505815   0.069435  -7.2847 3.222e-13 ***
## none    -0.702651   0.085148  -8.2521 2.220e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -2357.3
ml7g3 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$income == "평균 이하이다.", ])
summary(ml7g3)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$income == "평균 이하이다.", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.26346 0.22885 0.24615 0.26154 
## 
## nr method
## 5 iterations, 0h:0m:0s 
## g'(-H)^-1g = 3.78E-06 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error z-value  Pr(>|z|)    
## length2  0.571379   0.158376  3.6077 0.0003089 ***
## length3  1.017881   0.151733  6.7084 1.968e-11 ***
## wage2   -0.484640   0.121240 -3.9974 6.405e-05 ***
## wage3   -1.168216   0.156129 -7.4824 7.305e-14 ***
## work2   -0.405462   0.132923 -3.0503 0.0022858 ** 
## work3   -0.625607   0.140617 -4.4490 8.627e-06 ***
## none    -0.072418   0.169185 -0.4280 0.6686199    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -641.69
# Prepare results for plotting group 1
tmp <- ml6g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.10173323 0.09763328 0.07929824 0.10677608 0.08848310 0.09240248 0.12152756
modelframe7a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe7a$term <- as.character(modelframe7a$term)
dwplot(modelframe7a)

# Prepare results for plotting group 2
tmp <- ml7g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.07490525 0.07236942 0.06064414 0.08052780 0.06692240 0.06943479 0.08514776
modelframe7b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe7b$term <- as.character(modelframe7b$term)
dwplot(modelframe7b)

# Prepare results for plotting group 3
tmp <- ml7g3

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##   length2   length3     wage2     wage3     work2     work3      none 
## 0.1583763 0.1517333 0.1212402 0.1561294 0.1329234 0.1406172 0.1691848
modelframe7c <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)


modelframe7c$term <- as.character(modelframe7c$term)
dwplot(modelframe7c)

# combine results for all three groups into one data frame for plotting

modelframe_allg7 <- rbind(modelframe7a, modelframe7b, modelframe7c) # dataframe with outputs for all groups

modelframe_allg7$model <- rep(c("High","Middle","Low"), each = 7)

modelframe_allg7 <- modelframe_allg7[order(modelframe_allg7$model, decreasing = T), ]


modelframe_allg7$model <- factor (modelframe_allg7$model, 
                              levels = c("High","Middle","Low"), 
                              labels = c("High","Middle","Low"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p7 <- {dwplot( modelframe_allg7 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "High","Middle","Low"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Income", breaks=c("High","Middle","Low") ) + 
        scale_shape_discrete(name = "Income", breaks = c("High","Middle","Low") )  +
    guides(
      shape = guide_legend("Income"), 
      colour = guide_legend("Income"),
      size = guide_legend("Income")
    ) }   %>% 
  add_brackets(three_brackets)

p7

png("Fig_7.png", width=18, height=18,units="cm", res=400)

p7



pdf(paste0("Fig_7.pdf"), width=7, height=7)

p7

dev.off()
## png 
##   2
round(exp(coef(ml7g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.536   2.113   0.648   0.415   1.043   0.778   0.602
round(exp(coef(ml7g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.367   1.720   0.503   0.241   0.791   0.603   0.495
round(exp(coef(ml7g3)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.771   2.767   0.616   0.311   0.667   0.535   0.930

Estimated partworths by 안내문

table(df$experiment)
## 
## 고용연장   일자리 
##      224      176
ml8g1 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$experiment == "고용연장", ])


summary(ml8g1)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$experiment == "고용연장", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.28237 0.26395 0.27009 0.18359 
## 
## nr method
## 4 iterations, 0h:0m:0s 
## g'(-H)^-1g = 0.00456 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.431068   0.077499   5.5622 2.663e-08 ***
## length3  0.757893   0.073514  10.3094 < 2.2e-16 ***
## wage2   -0.504819   0.061734  -8.1773 2.220e-16 ***
## wage3   -1.097372   0.076422 -14.3593 < 2.2e-16 ***
## work2   -0.094517   0.067427  -1.4018     0.161    
## work3   -0.314867   0.070282  -4.4800 7.463e-06 ***
## none    -0.506352   0.091810  -5.5152 3.483e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -2245.9
ml8g2 <- mlogit(tmp_choice ~ length2 + length3 +
                wage2 + wage3 +
                work2 + work3 +
                none | 0, cbc[cbc$experiment == "일자리", ])
summary(ml8g2)
## 
## Call:
## mlogit(formula = tmp_choice ~ length2 + length3 + wage2 + wage3 + 
##     work2 + work3 + none | 0, data = cbc[cbc$experiment == "일자리", 
##     ], method = "nr")
## 
## Frequencies of alternatives:choice
##       1       2       3       4 
## 0.30043 0.24361 0.22301 0.23295 
## 
## nr method
## 5 iterations, 0h:0m:0s 
## g'(-H)^-1g = 2.36E-06 
## successive function values within tolerance limits 
## 
## Coefficients :
##          Estimate Std. Error  z-value  Pr(>|z|)    
## length2  0.304976   0.089007   3.4264 0.0006115 ***
## length3  0.531279   0.086745   6.1246 9.089e-10 ***
## wage2   -0.716891   0.072555  -9.8807 < 2.2e-16 ***
## wage3   -1.449748   0.096654 -14.9993 < 2.2e-16 ***
## work2   -0.312846   0.079214  -3.9494 7.836e-05 ***
## work3   -0.656698   0.083580  -7.8571 3.997e-15 ***
## none    -0.626759   0.098068  -6.3911 1.647e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log-Likelihood: -1725.8
# Prepare results for plotting group 1
tmp <- ml8g1

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.07749897 0.07351448 0.06173383 0.07642227 0.06742674 0.07028197 0.09180954
modelframe8a <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe8a$term <- as.character(modelframe8a$term)
dwplot(modelframe8a)

# Prepare results for plotting group 2
tmp <- ml8g2

covMatrix <- vcov(tmp)
sqrt(diag(covMatrix))
##    length2    length3      wage2      wage3      work2      work3       none 
## 0.08900669 0.08674453 0.07255476 0.09665411 0.07921413 0.08357968 0.09806804
modelframe8b <- data.frame(term = c(
  '2yr',
  '3yr',
  '70%',
  '60%',
  'transfer',
  'part_time',
  'No-choice option'),
  estimate = c(  tmp$coefficients[1], tmp$coefficients[2],
                 tmp$coefficients[3], tmp$coefficients[4],
                 tmp$coefficients[5], tmp$coefficients[6],
                 tmp$coefficients[7]),
  std.error = c( coef(summary(tmp))[1, "Std. Error"], coef(summary(tmp))[2, "Std. Error"], 
                 coef(summary(tmp))[3, "Std. Error"], coef(summary(tmp))[4, "Std. Error"], 
                 coef(summary(tmp))[5, "Std. Error"], coef(summary(tmp))[6, "Std. Error"], 
                 coef(summary(tmp))[7, "Std. Error"]
  )
)



modelframe8b$term <- as.character(modelframe8b$term)
dwplot(modelframe8b)

# combine results for all three groups into one data frame for plotting

modelframe_allg8 <- rbind(modelframe8a, modelframe8b) # dataframe with outputs for all groups

modelframe_allg8$model <- rep(c("Hire","Job"), each = 7)

modelframe_allg8 <- modelframe_allg8[order(modelframe_allg8$model, decreasing = T), ]


modelframe_allg8$model <- factor (modelframe_allg8$model, 
                              levels = c("Hire","Job"), 
                              labels = c("Hire","Job"))
  

three_brackets <- list(c("length\nRef: 1yr", 
                        "2yr", 
                        "3yr"), 
                      c("wage\nRef: 80%",
                        '70%',
                        '60%'),
                      c("work\nRef: reduction", 
                        'transfer',
                        'part-time')
) 

p8 <- {dwplot( modelframe_allg8 , 
              vline = geom_vline(xintercept = 0, colour = "grey60", linetype = 1),  
              model_order = c(
                "Hire","Job"
              ),
              dot_args = list(aes(shape = model) , size = 2.5  )  ) + 
    theme_bw( ) + xlab("Estimated partworth utilities") + ylab("") +  
        theme(plot.title = element_text(face="bold"),
        legend.position = "bottom",
        legend.title = element_blank()) + scale_color_grey(name = "Experiment", breaks=c("Hire","Job") ) + 
        scale_shape_discrete(name = "Experiment", breaks = c("Hire","Job") )  +
    guides(
      shape = guide_legend("Experiment"), 
      colour = guide_legend("Experiment"),
      size = guide_legend("Experiment")
    ) }   %>% 
  add_brackets(three_brackets)

p8

png("Fig_8.png", width=18, height=18,units="cm", res=400)

p8



pdf(paste0("Fig_8.pdf"), width=7, height=7)

p8

dev.off()
## png 
##   2
round(exp(coef(ml8g1)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.539   2.134   0.604   0.334   0.910   0.730   0.603
round(exp(coef(ml8g2)),3)
## length2 length3   wage2   wage3   work2   work3    none 
##   1.357   1.701   0.488   0.235   0.731   0.519   0.534

5. 정책프레임과 필요성 및 효과성에 대한 실험연구

5.1. 참고자료

노인일자리 사업의 필요성 및 효과성에 대한 실험연구: 정책 프레이밍과 연령대의 이원상호작용 효과(오주현 등, 2022)

5.2. 탐색적 요인분석(위 논문 표1)

크론바흐알파

효과성

psych::alpha(df[c("effect1", 
                       "effect2",
                       "effect3", 
                       "effect4",
                       "effect5",
                        "need1") ],
             check.keys=TRUE)
## 
## Reliability analysis   
## Call: psych::alpha(x = df[c("effect1", "effect2", "effect3", "effect4", 
##     "effect5", "need1")], check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.96      0.96    0.96      0.81  26 0.003  2.3 0.99      0.8
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.96  0.96  0.97
## Duhachek  0.96  0.96  0.97
## 
##  Reliability if an item is dropped:
##         raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## effect1      0.95      0.95    0.95      0.81  21   0.0036 0.0014  0.80
## effect2      0.95      0.95    0.95      0.81  21   0.0036 0.0014  0.81
## effect3      0.95      0.95    0.95      0.80  20   0.0038 0.0011  0.80
## effect4      0.95      0.95    0.95      0.80  20   0.0038 0.0012  0.79
## effect5      0.96      0.96    0.96      0.82  23   0.0033 0.0015  0.81
## need1        0.96      0.96    0.95      0.82  23   0.0034 0.0018  0.81
## 
##  Item statistics 
##           n raw.r std.r r.cor r.drop mean  sd
## effect1 400  0.92  0.92  0.91   0.89  2.3 1.1
## effect2 400  0.92  0.92  0.90   0.88  2.3 1.0
## effect3 400  0.93  0.93  0.92   0.90  2.3 1.1
## effect4 400  0.93  0.93  0.92   0.90  2.3 1.1
## effect5 400  0.89  0.89  0.86   0.85  2.2 1.1
## need1   400  0.90  0.90  0.87   0.86  2.3 1.1
## 
## Non missing response frequency for each item
##            1    2    3    4    5 miss
## effect1 0.29 0.27 0.30 0.12 0.02    0
## effect2 0.28 0.32 0.27 0.12 0.01    0
## effect3 0.28 0.28 0.30 0.11 0.02    0
## effect4 0.30 0.26 0.30 0.12 0.02    0
## effect5 0.31 0.28 0.28 0.11 0.02    0
## need1   0.30 0.24 0.29 0.14 0.03    0
# generate mean of all items 
df$effect <- apply(df[c("effect1", 
                       "effect2",
                       "effect3", 
                       "effect4",
                       "effect5",
                       "need1") ], 1, mean, na.rm=TRUE)

hist(df$effect)

필요성

psych::alpha(df[c("need2",
                       "need3", 
                       "need4",
                       "need5") ],
             check.keys=TRUE)
## 
## Reliability analysis   
## Call: psych::alpha(x = df[c("need2", "need3", "need4", "need5")], check.keys = TRUE)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean  sd median_r
##       0.95      0.95    0.94      0.84  20 0.0039  2.6 1.2     0.83
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.94  0.95  0.96
## Duhachek  0.95  0.95  0.96
## 
##  Reliability if an item is dropped:
##       raw_alpha std.alpha G6(smc) average_r S/N alpha se   var.r med.r
## need2      0.93      0.93    0.90      0.82  14   0.0059 0.00069  0.82
## need3      0.93      0.93    0.90      0.82  14   0.0060 0.00090  0.81
## need4      0.94      0.94    0.92      0.84  16   0.0053 0.00186  0.82
## need5      0.95      0.95    0.93      0.86  19   0.0044 0.00049  0.85
## 
##  Item statistics 
##         n raw.r std.r r.cor r.drop mean  sd
## need2 400  0.95  0.95  0.93   0.91  2.6 1.3
## need3 400  0.95  0.95  0.93   0.91  2.6 1.3
## need4 400  0.93  0.93  0.90   0.88  2.6 1.3
## need5 400  0.91  0.91  0.87   0.85  2.6 1.2
## 
## Non missing response frequency for each item
##          1    2    3    4    5 miss
## need2 0.29 0.18 0.25 0.23 0.05    0
## need3 0.30 0.17 0.25 0.22 0.06    0
## need4 0.30 0.17 0.26 0.20 0.07    0
## need5 0.28 0.19 0.26 0.22 0.05    0
df$need <- apply(df[c("need2",
                       "need3", 
                       "need4",
                       "need5") ], 1, mean, na.rm=TRUE)

hist(df$need)

2. 기초통계분석

2.1. 기초분석(표본의 특성)

  • 아래 그림의 셀에 값을 구합니다.
fig1
fig1

2.1.1. 남여

library(gmodels)
CrossTable(df$gender)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |      남성 |      여성 | 
##           |-----------|-----------|
##           |       308 |        92 | 
##           |     0.770 |     0.230 | 
##           |-----------|-----------|
## 
## 
## 
## 

2.1.2. 소속

CrossTable(df$affiliation)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |    공기업 |    사기업 | 
##           |-----------|-----------|
##           |       200 |       200 | 
##           |     0.500 |     0.500 | 
##           |-----------|-----------|
## 
## 
## 
## 

2.1.3. 연령

CrossTable(df$age)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |      장년 |      중년 |      청년 | 
##           |-----------|-----------|-----------|
##           |       151 |       118 |       131 | 
##           |     0.378 |     0.295 |     0.328 | 
##           |-----------|-----------|-----------|
## 
## 
## 
## 

2.1.4. 소득

CrossTable(df$income)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           | 평균 이상이다. | 평균 이하이다. |      평균이다. | 
##           |-----------|-----------|-----------|
##           |        95 |        65 |       240 | 
##           |     0.237 |     0.163 |     0.600 | 
##           |-----------|-----------|-----------|
## 
## 
## 
## 

2.1.5. 임금피크제 도입여부

CrossTable(df$peak)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |         0 |         1 | 
##           |-----------|-----------|
##           |        83 |       317 | 
##           |     0.207 |     0.792 | 
##           |-----------|-----------|
## 
## 
## 
## 

2.1.6. 효과성필요성 군집그룹(상중하) 1: 하, 2: 중, 3:상

CrossTable(df$kgroup)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |         1 |         2 |         3 | 
##           |-----------|-----------|-----------|
##           |        75 |       163 |       162 | 
##           |     0.188 |     0.407 |     0.405 | 
##           |-----------|-----------|-----------|
## 
## 
## 
## 

2.1.7. 안내문

CrossTable(df$experiment)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  400 
## 
##  
##           |  고용연장 |    일자리 | 
##           |-----------|-----------|
##           |       224 |       176 | 
##           |     0.560 |     0.440 | 
##           |-----------|-----------|
## 
## 
## 
## 

3. 표본의 특성

fig2 ## 3.1. 정책효과성

3.1.1. 남여

df %>% 
  group_by(gender) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

t검증(남여별 차이)

t.test(df$effect ~ df$gender)
## 
##  Welch Two Sample t-test
## 
## data:  df$effect by df$gender
## t = -3.3624, df = 152.78, p-value = 0.0009766
## alternative hypothesis: true difference in means between group 남성 and group 여성 is not equal to 0
## 95 percent confidence interval:
##  -0.6088576 -0.1581760
## sample estimates:
## mean in group 남성 mean in group 여성 
##           2.208874           2.592391

3.1.2. 소속

df %>% 
  group_by(affiliation) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

F검증

res.aov <- aov(effect ~ affiliation, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value Pr(>F)  
## affiliation   1    5.3   5.252   5.466 0.0199 *
## Residuals   398  382.4   0.961                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.1.3. 연령

df %>% 
  group_by(age) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

F검증

res.aov <- aov(effect ~ age, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## age           2   15.8   7.906    8.44 0.000257 ***
## Residuals   397  371.9   0.937                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.1.4. 소득

df %>% 
  group_by(income) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

F검증

res.aov <- aov(effect ~ income, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value Pr(>F)
## income        2    0.4  0.2159   0.221  0.802
## Residuals   397  387.2  0.9754

3.1.5. 임금피크제 도입여부

df %>% 
  group_by(peak) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

t검증(도입여부 차이)

t.test(df$effect ~ df$peak)
## 
##  Welch Two Sample t-test
## 
## data:  df$effect by df$peak
## t = 2.1049, df = 160.55, p-value = 0.03685
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.01363054 0.42751575
## sample estimates:
## mean in group 0 mean in group 1 
##        2.471888        2.251314

3.1.6. 효과성필요성 군집그룹(상중하) 1: 하, 2: 중, 3:상

df %>% 
  group_by(kgroup) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

F검증

res.aov <- aov(effect ~ kgroup, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value Pr(>F)    
## kgroup        1  275.6  275.65   979.4 <2e-16 ***
## Residuals   398  112.0    0.28                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.1.7. 안내문

df %>% 
  group_by(experiment) %>%
  summarise(count = n(),
      mean = mean(effect),
      sd=sd(effect))

t검증

t.test(df$effect ~ df$experiment)
## 
##  Welch Two Sample t-test
## 
## data:  df$effect by df$experiment
## t = 2.5658, df = 383.81, p-value = 0.01067
## alternative hypothesis: true difference in means between group 고용연장 and group 일자리 is not equal to 0
## 95 percent confidence interval:
##  0.05877266 0.44420353
## sample estimates:
## mean in group 고용연장   mean in group 일자리 
##               2.407738               2.156250

3.2. 정책필요성 차이분석

3.2.1. 남여

df %>% 
  group_by(gender) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

t검증(남여별 차이)

t.test(df$need ~ df$gender)
## 
##  Welch Two Sample t-test
## 
## data:  df$need by df$gender
## t = -1.6426, df = 164.8, p-value = 0.1024
## alternative hypothesis: true difference in means between group 남성 and group 여성 is not equal to 0
## 95 percent confidence interval:
##  -0.4788555  0.0439317
## sample estimates:
## mean in group 남성 mean in group 여성 
##           2.516234           2.733696

3.2.2. 소속

df %>% 
  group_by(affiliation) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

F검증

res.aov <- aov(need ~ affiliation, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## affiliation   1   18.3  18.276   13.41 0.000284 ***
## Residuals   398  542.3   1.363                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.2.3. 연령

df %>% 
  group_by(age) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

F검증

res.aov <- aov(need ~ age, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value  Pr(>F)   
## age           2   14.3   7.150   5.196 0.00592 **
## Residuals   397  546.3   1.376                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.2.4. 소득

df %>% 
  group_by(income) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

F검증

res.aov <- aov(need ~ income, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value Pr(>F)
## income        2    1.3  0.6458   0.458  0.633
## Residuals   397  559.3  1.4089

3.2.5. 임금피크제 도입여부

df %>% 
  group_by(peak) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

t검증(도입여부 차이)

t.test(df$need ~ df$peak)
## 
##  Welch Two Sample t-test
## 
## data:  df$need by df$peak
## t = 2.7024, df = 137.74, p-value = 0.007751
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.09993683 0.64503676
## sample estimates:
## mean in group 0 mean in group 1 
##        2.861446        2.488959

3.2.6. 효과성필요성 군집그룹(상중하) 1: 하, 2: 중, 3:상

df %>% 
  group_by(kgroup) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

F검증

res.aov <- aov(need ~ kgroup, data = df)
summary(res.aov)
##              Df Sum Sq Mean Sq F value Pr(>F)    
## kgroup        1  457.2   457.2    1760 <2e-16 ***
## Residuals   398  103.4     0.3                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.2.7. 안내문

df %>% 
  group_by(experiment) %>%
  summarise(count = n(),
      mean = mean(need),
      sd=sd(need))

t검증

t.test(df$need ~ df$experiment)
## 
##  Welch Two Sample t-test
## 
## data:  df$need by df$experiment
## t = 1.5565, df = 384.76, p-value = 0.1204
## alternative hypothesis: true difference in means between group 고용연장 and group 일자리 is not equal to 0
## 95 percent confidence interval:
##  -0.04849415  0.41700064
## sample estimates:
## mean in group 고용연장   mean in group 일자리 
##               2.647321               2.463068

4. 실험연구 분석

4.1.Descriptive statistics

table(df$affiliation, df$experiment)
##         
##          고용연장 일자리
##   공기업      124     76
##   사기업      100    100
# mean by group
aggregate(need ~ affiliation,
  data = df,
  FUN = mean
)
aggregate(need ~ affiliation + experiment,
  data = df,
  FUN = mean
)
# mean by group
aggregate(effect ~ affiliation,
  data = df,
  FUN = mean
)
library(dplyr)

group_by(df, experiment, affiliation) %>%
  summarise(
    mean = mean(need, na.rm = TRUE),
    sd = sd(need, na.rm = TRUE),
  )
## `summarise()` has grouped output by 'experiment'. You can override using the
## `.groups` argument.
library(ggplot2)

ggplot(df) +
  aes(x = experiment, y = need, fill = affiliation) +
  geom_boxplot()

# Two-way ANOVA with interaction
# save model
mod <- aov(need ~ experiment * affiliation,
  data = df
)

# print results
summary(mod)
##                         Df Sum Sq Mean Sq F value   Pr(>F)    
## experiment               1    3.3   3.346   2.521  0.11311    
## affiliation              1   20.5  20.515  15.459 9.95e-05 ***
## experiment:affiliation   1   11.2  11.236   8.467  0.00382 ** 
## Residuals              396  525.5   1.327                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# method 2
library(ggpubr)

ggline(subset(df, !is.na(experiment)), # remove NA level for sex
  x = "experiment",
  y = "need",
  line = "affiliation",
  add = c("mean_se") # add mean and standard error
) +
  labs(y = "NEED")

ggsave("need.png")
## Saving 7 x 5 in image
with(
  df,
  interaction.plot(experiment, affiliation, need)
)

mod2 <- lm(need ~ experiment + affiliation,
  data = df
)

summary(mod2)
## 
## Call:
## lm.default(formula = need ~ experiment + affiliation, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8999 -1.1685  0.1001  0.8501  2.7962 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        2.44363    0.09365  26.093  < 2e-16 ***
## experiment일자리  -0.23981    0.11799  -2.032 0.042773 *  
## affiliation사기업  0.45628    0.11714   3.895 0.000115 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.163 on 397 degrees of freedom
## Multiple R-squared:  0.04256,    Adjusted R-squared:  0.03774 
## F-statistic: 8.824 on 2 and 397 DF,  p-value: 0.000178
mod3 <- lm(effect ~ experiment + affiliation,
  data = df
)

summary(mod3)
## 
## Call:
## lm.default(formula = effect ~ experiment + affiliation, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.55343 -0.93657 -0.05343  0.70976  2.73010 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        2.29024    0.07824  29.273  < 2e-16 ***
## experiment일자리  -0.28353    0.09857  -2.876  0.00424 ** 
## affiliation사기업  0.26319    0.09786   2.690  0.00746 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9714 on 397 degrees of freedom
## Multiple R-squared:  0.03369,    Adjusted R-squared:  0.02882 
## F-statistic:  6.92 on 2 and 397 DF,  p-value: 0.001112
# Two-way ANOVA with interaction
# save model
mod4<- aov(effect ~ experiment * affiliation,
  data = df
)

# print results
summary(mod4)
##                         Df Sum Sq Mean Sq F value  Pr(>F)   
## experiment               1    6.2   6.234    6.63 0.01039 * 
## affiliation              1    6.8   6.826    7.26 0.00735 **
## experiment:affiliation   1    2.3   2.313    2.46 0.11759   
## Residuals              396  372.3   0.940                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# method 2
library(ggpubr)

ggline(subset(df, !is.na(experiment)), # remove NA level for sex
  x = "experiment",
  y = "effect",
  line = "affiliation",
  add = c("mean_se") # add mean and standard error
) +
  labs(y = "Effect")

ggsave("effect.png")
## Saving 7 x 5 in image
library(ggplot2)

ggplot(df) +
  aes(x = experiment, y = effect, fill = affiliation) +
  geom_boxplot()