Bogaert, Patrick
[UCL]
Gengler, Sarah
[UCL]
Categorical data play an important role in a wide variety of spatial applications, yet modeling and predicting this type of statistical variable has proved complex in many cases. Among other possible approaches, the Bayesian maximum entropy methodology has been developed and advocated for this purpose and has been successfully applied to various spatial prediction problems. This approach builds a multivariate probability table from bivariate probability functions, used as constraints that must be fulfilled, in order to compute a posterior conditional distribution that accounts for hard or soft information sources. In this paper, our goal is to generalize the theoretical results further in order to account for a much wider range of information sources, such as probability inequalities. We first show how the maximum entropy principle can be implemented efficiently using a linear iterative approximation based on a minimum norm criterion, where the minimum norm solution is obtained at each step from simple matrix operations and the sequence of solutions converges to the requested maximum entropy solution. Based on this result, we then show how the maximum entropy problem can be related to the more general minimum divergence problem, which may involve equality and inequality constraints and which can be solved using iterated minimum norm solutions. This allows us to account for a much larger range of information types, in which more qualitative information, such as probability inequalities, can be used. When combined with a Bayesian data fusion approach, the method can also handle potentially conflicting information. Although the theoretical results presented in this paper can be applied to any study (spatial or non-spatial) involving categorical data, they are illustrated in a spatial context where the goal is to best predict the occurrence of cultivated land in Ethiopia based on crowdsourced information. The results emphasize the benefit of the methodology, which integrates conflicting information and provides a spatially exhaustive map of these occurrence classes over the whole country.
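As a rough illustration of what "building a multivariate probability table from bivariate probability functions" can look like in practice, the sketch below uses classical iterative proportional fitting (IPF) started from a uniform table, which is a standard way to reach the maximum entropy table under marginal equality constraints. This is not the minimum norm iteration described in the abstract, and it does not handle inequality constraints; the function name `maxent_from_bivariate`, the table sizes, and the synthetic constraints are all assumptions made for the example.

```python
import numpy as np


def maxent_from_bivariate(marginals, shape, n_iter=500, tol=1e-12):
    """Fit a joint table to bivariate marginals by IPF (hypothetical helper).

    marginals: dict mapping an axis pair (i, j), with i < j, to the
               2-D target table for those two variables.
    shape:     number of categories of each variable.
    """
    p = np.full(shape, 1.0 / np.prod(shape))  # uniform start = maximum entropy prior
    for _ in range(n_iter):
        max_err = 0.0
        for (i, j), target in marginals.items():
            other = tuple(k for k in range(len(shape)) if k not in (i, j))
            current = p.sum(axis=other, keepdims=True)
            # Reshape the 2-D target so it broadcasts along axes i and j.
            bshape = [1] * len(shape)
            bshape[i], bshape[j] = shape[i], shape[j]
            t = target.reshape(bshape)
            max_err = max(max_err, np.abs(current - t).max())
            ratio = np.divide(t, current, out=np.zeros_like(current),
                              where=current > 0)
            p = p * ratio  # rescale so this bivariate marginal matches
        if max_err < tol:
            break
    return p


def entropy(p):
    """Shannon entropy of a probability table (natural log)."""
    q = p[p > 0]
    return float(-(q * np.log(q)).sum())


# Synthetic example: three categorical variables with 2, 3 and 2 classes.
rng = np.random.default_rng(0)
shape = (2, 3, 2)
truth = rng.random(shape) + 0.5
truth /= truth.sum()

# Bivariate constraints taken from the synthetic table, so they are consistent.
constraints = {
    (0, 1): truth.sum(axis=2),
    (0, 2): truth.sum(axis=1),
    (1, 2): truth.sum(axis=0),
}

p_hat = maxent_from_bivariate(constraints, shape)
print("sum of fitted table:", p_hat.sum())
print("entropy (fitted) :", entropy(p_hat))
print("entropy (truth)  :", entropy(truth))
```

Because the synthetic table itself satisfies the constraints, the fitted table should have an entropy at least as large as that of the synthetic one, which gives a simple sanity check on the result.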
Bibliographic reference
Bogaert, Patrick ; Gengler, Sarah. Bayesian maximum entropy and data fusion for processing qualitative data: theory and application for crowdsourced cropland occurrences in Ethiopia. In: Stochastic Environmental Research and Risk Assessment, Vol. 31, p. 1-17 (16 June 2017)
Permanent URL
http://hdl.handle.net/2078.1/185571