In economics, there is a long tradition of trying to understand market behavior. Properly identifying a market demand function is hard, but various authors have studied fish markets as examples of markets where a relatively homogeneous good (fish) with a short shelf life (its quality deteriorates if it is not bought promptly) is sold in competitive regional markets, such as the fish markets at Saumaty in Marseille, France, or MERITAN in Ancona, Italy. Alan Kirman has been involved in much of this work over the past few decades, trying to understand the complex interactions that produce market demand functions from individual demand functions that are not particularly consistent. Härdle and Kirman (1995) was one of his first papers to study the Marseille fish market, which he returned to in his book (Kirman 2010). He has also worked with co-authors on other fish markets, such as the Ancona fish market (Gallegati et al. 2011). Prof Kirman has kindly provided me with the data from his Marseille studies, so we have the opportunity to look at this rather marvellous data set.
library(tidyverse)
Import the data as follows. Notice that this is different from how we’ve imported data before. The data are in a whitespace-delimited .txt file rather than a comma-separated file, so we can’t use read.csv() as we usually do. Instead, we use read_table() from the readr package in the tidyverse.
MarseilleBase <- read_table("../more/MARS.txt", col_names = FALSE)
## Parsed with column specification:
## cols(
## X1 = col_character(),
## X2 = col_double(),
## X3 = col_character(),
## X4 = col_double(),
## X5 = col_character(),
## X6 = col_character(),
## X7 = col_double(),
## X8 = col_double()
## )
Look at the .txt file. Why do you think we had to specify the option col_names = FALSE?
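As a hint for the question above, here is a minimal sketch of what col_names controls. The two-line file and its values are made up for illustration (it is not the Marseille file), and the object names tmp, no_header and with_header are hypothetical.

```r
library(readr)

# Write a tiny whitespace-delimited file with NO header row (made-up values).
tmp <- tempfile(fileext = ".txt")
writeLines(c("002 18064 4000",
             "002 18070 2200"), tmp)

# col_names = FALSE: every line is data; columns get default names X1, X2, ...
no_header <- read_table(tmp, col_names = FALSE)

# col_names = TRUE (the default): the first line is consumed as column names.
with_header <- read_table(tmp, col_names = TRUE)

nrow(no_header)    # both lines are kept as observations
nrow(with_header)  # one observation has been used up as the header
names(no_header)
```

Comparing the two row counts shows what would happen to the first line of a header-less file if the option were left at its default.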
Use head() to look at the data. There is an issue in one column (find the issue).
head(MarseilleBase)
## # A tibble: 6 x 8
## X1 X2 X3 X4 X5 X6 X7 X8
## <chr> <dbl> <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 002 18064 ESCALE BASTIA 1002 880105 05415 077 110 4000
## 2 002 18070 ESCALE BASTIA 1002 880105 05415 044 120 2200
## 3 002 18078 PSS DE L'OPERA1002 880104 05505 214 90 6500
## 4 002 18081 GENTY PERTUIS 1002 880105 10080 214 150 6500
## 5 002 18085 MONTLAUR 1002 880102 10090 077 205 3600
## 6 002 18088 MONTLAUR 1002 880102 10090 044 60 1800
As you can see, we need to solve this problem. We can do this using the separate() function in tidyr, which is part of the tidyverse.
Marseille <-
MarseilleBase %>%
separate(X3, c("X3a", "X3b"), sep = 14)
head(Marseille)
## # A tibble: 6 x 9
## X1 X2 X3a X3b X4 X5 X6 X7 X8
## <chr> <dbl> <chr> <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 002 18064 "ESCALE BASTIA " 1002 880105 05415 077 110 4000
## 2 002 18070 "ESCALE BASTIA " 1002 880105 05415 044 120 2200
## 3 002 18078 PSS DE L'OPERA 1002 880104 05505 214 90 6500
## 4 002 18081 "GENTY PERTUIS " 1002 880105 10080 214 150 6500
## 5 002 18085 "MONTLAUR " 1002 880102 10090 077 205 3600
## 6 002 18088 "MONTLAUR " 1002 880102 10090 044 60 1800
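To see what sep = 14 does in isolation, the following sketch applies the same call to a toy tibble. The two strings mimic the layout of X3 shown in the output above (a 14-character name field run together with a four-digit code); the object names toy and toy_split are hypothetical, and the trimws() step is an optional extra that the code above does not perform.

```r
library(tidyr)
library(tibble)

# Toy stand-in for the X3 column: a 14-character name field fused with a code.
toy <- tibble(X3 = c("PSS DE L'OPERA1002",
                     "ESCALE BASTIA 1002"))

# A numeric `sep` is a character position: split after the 14th character.
toy_split <- separate(toy, X3, into = c("X3a", "X3b"), sep = 14)
toy_split

# Optional: strip the padding that the fixed-width layout leaves on short names.
toy_split$X3a <- trimws(toy_split$X3a)
toy_split
```

Splitting by position rather than by a delimiter is what makes this work here, because the longest names leave no space before the code.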
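What did I do above? I created a new object, Marseille, from MarseilleBase. I called separate() on the variable X3. I told separate() to create two new variables: X3a and X3b. I told separate() to split X3 at the 14th character.
Next, rename the columns, then rescale tempa and tempb into weight and price:
Marseille <-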
Marseille %>%
rename(sellerb = X1,
lot = X2,
nbuyer = X3a,
seller = X3b,
date = X4,
buyerc = X5,
fish = X6,
tempa = X7,
tempb = X8)
Mars <-
Marseille %>%
mutate(weight = tempa/10,
price = tempb/100,
fish = parse_number(fish))
head(Mars)
## # A tibble: 6 x 11
## sellerb lot nbuyer seller date buyerc fish tempa tempb weight price
## <chr> <dbl> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 002 18064 "ESCAL… 1002 880105 05415 77 110 4000 11 40
## 2 002 18070 "ESCAL… 1002 880105 05415 44 120 2200 12 22
## 3 002 18078 PSS DE… 1002 880104 05505 214 90 6500 9 65
## 4 002 18081 "GENTY… 1002 880105 10080 214 150 6500 15 65
## 5 002 18085 "MONTL… 1002 880102 10090 77 205 3600 20.5 36
## 6 002 18088 "MONTL… 1002 880102 10090 44 60 1800 6 18
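The mutate() step above converts the fish code from text to a number and rescales the two temporary columns. A small sketch of those conversions, using values copied from the first three rows of the output above (the vectors here are stand-ins, not the full data):

```r
library(readr)

# parse_number() turns zero-padded codes such as "077" into numbers.
fish_code <- parse_number(c("077", "044", "214"))
fish_code

# The same rescaling as in the mutate() call: tempa / 10 and tempb / 100.
tempa <- c(110, 120, 90)
tempb <- c(4000, 2200, 6500)
weight <- tempa / 10
price  <- tempb / 100
data.frame(fish_code, weight, price)
```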
Mars %>%
ggplot(aes(x = weight, y = price)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ x, size = 1) +
ylim(0, 100)
Note that we don’t have to impose a linear, downward-sloping function. With method = "auto", stat_smooth() chooses a flexible smoother instead:
Mars %>%
ggplot(aes(x = weight, y = price)) +
geom_point() +
stat_smooth(method = "auto", size = 1) +
ylim(0, 100)
Notice that you can run a linear regression with the whole sample of fish:
m1 <- lm(price ~ weight, data = Mars)
summary(m1)
##
## Call:
## lm(formula = price ~ weight, data = Mars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -52.63 -20.45 -8.68 15.94 616.91
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 54.365659 0.065227 833.49 <2e-16 ***
## weight -0.123012 0.001865 -65.97 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 29.56 on 237159 degrees of freedom
## Multiple R-squared: 0.01802, Adjusted R-squared: 0.01802
## F-statistic: 4353 on 1 and 237159 DF, p-value: < 2.2e-16
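One way to read the coefficient table is to compute a fitted price by hand. The sketch below copies the two estimates from the summary(m1) output above; predict_price is a hypothetical helper written only for this illustration.

```r
# Coefficients copied from summary(m1) above.
intercept <- 54.365659
slope     <- -0.123012

# Fitted price for a lot of a given weight: intercept + slope * weight.
predict_price <- function(weight) intercept + slope * weight

fit_10 <- predict_price(10)                       # fitted price at weight = 10
step_10 <- predict_price(20) - predict_price(10)  # change for 10 extra units of weight
fit_10
step_10
```

With the fitted model object available, predict(m1, newdata = data.frame(weight = 10)) should agree with the hand calculation up to rounding. Note also that the R-squared of about 0.018 in the output says weight alone explains very little of the variation in price across all fish.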
Consult the Marseille Data Description document to check what the different kinds of fish are.
I have generated the following graph for one of the kinds of fish (Merlan) by filtering and then graphing the data.
Filtering for Merlan:
MarsFish44 <-
Mars %>%
filter(fish == 44)
Graphing the Merlan data:
MarsFish44 %>%
ggplot(aes(x = weight, y = price)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ x, size = 1) +
ylim(0, 100)
We can also run a linear regression just with the filtered Merlan data.
OLSMerlan <- lm(price ~ weight, data = MarsFish44)
summary(OLSMerlan)
##
## Call:
## lm(formula = price ~ weight, data = MarsFish44)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.37 -8.03 0.45 7.45 345.26
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.461832 0.060521 552.90 <2e-16 ***
## weight -0.181625 0.003097 -58.64 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.08 on 53679 degrees of freedom
## Multiple R-squared: 0.06021, Adjusted R-squared: 0.06019
## F-statistic: 3439 on 1 and 53679 DF, p-value: < 2.2e-16
Do the following:
We want to include dummy variables for the different kinds of fish in our regressions. So we need to check whether the fish variable is numeric or a factor. We could head() the data and see, but we can also get a “TRUE” or “FALSE” report using the is.numeric() or is.factor() functions.
is.numeric(Mars$fish) #This must be true because I made it numeric
## [1] TRUE
is.factor(Mars$fish) #This must be false unless you change it
## [1] FALSE
As you can see, we need to change fish into a factor variable because we want to use the factors as dummy variables.
We need to take the information from the Marseille Data Details document and create two vectors using the following idea: ObjectName <- c(content1, content2, ..., contentn).
I would recommend you call the first object (with the numbers) FishLevels because it corresponds to the levels of your factor variable.
I would recommend you call the second object FishLabels because it corresponds to the labels of the fish that you want.
You will then use the function factor() inside the function mutate() (which you already know).
You would then do something like the following (note this won’t work because I don’t have actual values attributed to these objects yet).
MarsFactorFish <-
Mars %>%
mutate(fishvariablename = factor(fishvariablename,
levels = FishLevels,
labels = FishLabels))
You should get something like the following when you head your data:
FishLabels <- c("Merlan", "Carrelet", "Cabillaud", "Loup/Bar", " Raie Ange", "Raie (Entiere)", "Sole", "Turbot", "Cabillaud (Tranches)", "Sole (Filet)", "Annarrhyque (Filets)", "Loup (Filets)")
FishLevels <- c(44, 62, 77, 122, 204, 205, 214, 244, 285, 289, 314, 315)
Mars <-
Mars %>%
mutate(fish = factor(fish,
levels = FishLevels,
labels = FishLabels))
head(Mars)
## # A tibble: 6 x 11
## sellerb lot nbuyer seller date buyerc fish tempa tempb weight price
## <chr> <dbl> <chr> <chr> <dbl> <chr> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 002 18064 "ESCAL… 1002 880105 05415 Cabi… 110 4000 11 40
## 2 002 18070 "ESCAL… 1002 880105 05415 Merl… 120 2200 12 22
## 3 002 18078 PSS DE… 1002 880104 05505 Sole 90 6500 9 65
## 4 002 18081 "GENTY… 1002 880105 10080 Sole 150 6500 15 65
## 5 002 18085 "MONTL… 1002 880102 10090 Cabi… 205 3600 20.5 36
## 6 002 18088 "MONTL… 1002 880102 10090 Merl… 60 1800 6 18
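The recoding can be checked on a toy vector before trusting it on the full data. In this sketch the three levels and labels are taken from the lists above, while the code 999 is made up to show what happens to a value that is not listed in levels; codes and toy_fish are hypothetical names.

```r
# Toy fish codes; 999 is a made-up code that is NOT listed in the levels.
codes <- c(77, 44, 214, 999)

toy_fish <- factor(codes,
                   levels = c(44, 77, 214),
                   labels = c("Merlan", "Cabillaud", "Sole"))

toy_fish              # codes are replaced by labels; the unlisted code becomes NA
levels(toy_fish)      # order follows `levels`, so "Merlan" is the first level
sum(is.na(toy_fish))  # number of values that did not match any level
```

Any code in the real data that is missing from FishLevels would become NA in the same way, so it may be worth running sum(is.na(Mars$fish)) after the recoding. The first level matters too: it is the one R treats as the reference category when the factor enters a regression.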
Now include the fish variable as a factor in our regression:
m2 <- lm(price ~ weight + fish, data = Mars)
summary(m2)
##
## Call:
## lm(formula = price ~ weight + fish, data = Mars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -108.07 -7.61 -0.01 8.19 559.08
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 32.373332 0.070891 456.665 < 2e-16 ***
## weight -0.071872 0.001057 -67.992 < 2e-16 ***
## fishCarrelet -15.228267 0.416955 -36.523 < 2e-16 ***
## fishCabillaud -0.762175 0.111204 -6.854 7.21e-12 ***
## fishLoup/Bar 79.294592 0.126323 627.712 < 2e-16 ***
## fish Raie Ange 4.903407 0.128050 38.293 < 2e-16 ***
## fishRaie (Entiere) 19.105380 3.316580 5.761 8.39e-09 ***
## fishSole 33.302315 0.090079 369.702 < 2e-16 ***
## fishTurbot 40.010028 0.247857 161.424 < 2e-16 ***
## fishCabillaud (Tranches) 9.517823 0.630584 15.094 < 2e-16 ***
## fishSole (Filet) 46.448343 0.683419 67.965 < 2e-16 ***
## fishAnnarrhyque (Filets) 12.531891 0.299675 41.818 < 2e-16 ***
## fishLoup (Filets) 13.706294 0.208501 65.737 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.24 on 237042 degrees of freedom
## (106 observations deleted due to missingness)
## Multiple R-squared: 0.7036, Adjusted R-squared: 0.7036
## F-statistic: 4.689e+04 on 12 and 237042 DF, p-value: < 2.2e-16
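The coefficient names in this output (fishCarrelet, fishSole, and so on) come from the dummy variables that lm() builds from the factor. A minimal sketch on made-up toy data shows those columns directly; toy, X and sole_fit are hypothetical names, and the three coefficients in the last step are copied from the summary(m2) output above.

```r
# Toy data: three kinds of fish, with "Merlan" as the first (reference) level.
toy <- data.frame(
  weight = c(12, 11, 9, 6),
  fish   = factor(c("Merlan", "Cabillaud", "Sole", "Merlan"),
                  levels = c("Merlan", "Cabillaud", "Sole"))
)

# model.matrix() shows the columns that lm() builds from the formula:
# one 0/1 dummy per non-reference level, and no column for "Merlan".
X <- model.matrix(~ weight + fish, data = toy)
X

# Coefficients copied from summary(m2): fitted price of a Sole lot at weight 10
# is intercept + weight slope * 10 + the Sole dummy coefficient.
sole_fit <- 32.373332 + (-0.071872) * 10 + 33.302315
sole_fit
```

Because Merlan has no dummy of its own, each fish coefficient is read as a price difference relative to Merlan at the same weight.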
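We can also interact weight with fish, so that the slope on weight is allowed to differ across kinds of fish:
m3 <- lm(price ~ weight + fish + weight*fish, data = Mars)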
summary(m3)
##
## Call:
## lm(formula = price ~ weight + fish + weight * fish, data = Mars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -108.88 -7.86 -0.03 8.04 559.41
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.461832 0.080694 414.676 < 2e-16 ***
## weight -0.181625 0.004130 -43.982 < 2e-16 ***
## fishCarrelet -16.045520 0.509098 -31.518 < 2e-16 ***
## fishCabillaud -2.939527 0.122115 -24.072 < 2e-16 ***
## fishLoup/Bar 79.198454 0.148148 534.591 < 2e-16 ***
## fish Raie Ange 4.262459 0.149862 28.443 < 2e-16 ***
## fishRaie (Entiere) 35.941919 5.033931 7.140 9.36e-13 ***
## fishSole 33.210235 0.101279 327.908 < 2e-16 ***
## fishTurbot 40.505680 0.298213 135.828 < 2e-16 ***
## fishCabillaud (Tranches) 8.618767 0.969096 8.894 < 2e-16 ***
## fishSole (Filet) 46.469263 0.737629 62.998 < 2e-16 ***
## fishAnnarrhyque (Filets) 9.922578 0.332157 29.873 < 2e-16 ***
## fishLoup (Filets) 11.438594 0.224946 50.850 < 2e-16 ***
## weight:fishCarrelet 0.085692 0.026419 3.244 0.00118 **
## weight:fishCabillaud 0.156957 0.004408 35.610 < 2e-16 ***
## weight:fishLoup/Bar -0.017133 0.009590 -1.787 0.07400 .
## weight:fish Raie Ange 0.037671 0.011757 3.204 0.00135 **
## weight:fishRaie (Entiere) -4.018855 0.877712 -4.579 4.68e-06 ***
## weight:fishSole 0.019803 0.004686 4.226 2.38e-05 ***
## weight:fishTurbot -0.033145 0.015344 -2.160 0.03076 *
## weight:fishCabillaud (Tranches) 0.098372 0.044602 2.206 0.02742 *
## weight:fishSole (Filet) -0.086347 0.051126 -1.689 0.09124 .
## weight:fishAnnarrhyque (Filets) 0.149352 0.005607 26.634 < 2e-16 ***
## weight:fishLoup (Filets) 0.150366 0.004985 30.164 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.11 on 237031 degrees of freedom
## (106 observations deleted due to missingness)
## Multiple R-squared: 0.7085, Adjusted R-squared: 0.7085
## F-statistic: 2.505e+04 on 23 and 237031 DF, p-value: < 2.2e-16
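In the interaction model, the weight coefficient is the slope for the reference fish (Merlan), and each weight:fish coefficient is the difference between that fish's slope and the Merlan slope. The sketch below adds them up for three fish, using estimates copied from the summary(m3) output above; the object names are hypothetical. It also checks, using only the formulas, that weight*fish already contains both main effects.

```r
# Coefficients copied from summary(m3) above.
base_slope <- -0.181625                   # `weight`: slope for the reference fish (Merlan)
interaction <- c(Cabillaud = 0.156957,    # weight:fishCabillaud
                 Sole      = 0.019803,    # weight:fishSole
                 Turbot    = -0.033145)   # weight:fishTurbot

# Slope for each fish = reference slope + that fish's interaction coefficient.
fish_slopes <- base_slope + interaction
fish_slopes

# `weight*fish` expands to both main effects plus the interaction, so the
# formula used for m3 has the same terms as price ~ weight * fish.
labels_long  <- attr(terms(price ~ weight + fish + weight*fish), "term.labels")
labels_short <- attr(terms(price ~ weight * fish), "term.labels")
labels_long
labels_short
```

The intercept and weight estimates in summary(m3) are the same as those in the Merlan-only regression (OLSMerlan) above, which is what one would expect when Merlan is the reference level and every other fish gets its own intercept shift and slope shift.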
Gallegati, Mauro, Gianfranco Giulioni, Alan Kirman, and Antonio Palestrini. 2011. “What’s That Got to Do with the Price of Fish? Buyers Behavior on the Ancona Fish Market.” Journal of Economic Behavior & Organization 80 (1): 20–33. https://doi.org/10.1016/j.jebo.2011.01.011.
Härdle, Wolfgang, and Alan Kirman. 1995. “Nonclassical Demand: A Model-Free Examination of Price-Quantity Relations in the Marseille Fish Market.” Journal of Econometrics 67 (1): 227–57. https://doi.org/10.1016/0304-4076(94)01634-C.
Kirman, Alan. 2010. Complex Economics: Individual and Collective Rationality. Routledge.