Predict the Criminals

Easy

Problem Statement

There has been a surge in crimes committed in recent years, making crime a top cause of concern for law enforcement. If we are able to estimate whether someone is going to commit a crime in the future, we can take precautions and be prepared. You are given a dataset containing answers to various questions concerning the professional and private lives of several people. A few of them have been arrested for various small and large crimes in the past. Use the given data to predict if the people in the test data will commit a crime. The train data consists of 45718 rows, while the test data consists of 11430 rows.

(Update - 6 March, 2018)

Please note that the evaluation metric has been changed from Precision to the Matthews correlation coefficient, because Precision proved ineffective on this dataset.

The evaluation metric is the Matthews correlation coefficient.
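
To make the metric change concrete, here is a minimal sketch of both metrics computed from confusion-matrix counts. The helper names (`confusion_counts`, `precision`, `mcc`) and the toy labels are illustrative assumptions, not part of the challenge; the official scorer may differ in details such as how it handles a zero denominator.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(y_true, y_pred):
    """TP / (TP + FP); defined here as 0.0 when nothing is predicted positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient:
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention, assumed here).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels (made up for illustration): 10 positives among 100 people,
# and a model that flags only one of them.
y_true = [1] * 10 + [0] * 90
y_pred = [1] + [0] * 99

print("precision:", precision(y_true, y_pred))
print("MCC:", round(mcc(y_true, y_pred), 4))
```

In this toy case the model misses nine of the ten positives, yet Precision is still 1.0, while the MCC is only about 0.30. That is one plausible reading of why Precision was replaced. If scikit-learn is available, `sklearn.metrics.matthews_corrcoef` computes the same quantity.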

Data Description

You are given three files to download: train, test and sample submission.
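
As a rough sketch of how the three files might be handled, the snippet below reads a CSV with a header row and writes predictions in a two-column layout. The column layout of the submission (`PERID`, `Criminal`) and the file paths passed to `make_baseline_submission` are assumptions; check the sample submission file for the actual format.

```python
import csv


def read_rows(fileobj):
    """Read a CSV with a header row into a list of dicts keyed by column name."""
    return list(csv.DictReader(fileobj))


def write_submission(fileobj, person_ids, predictions):
    """Write one row per person: the ID and a 0/1 prediction.

    The header ("PERID", "Criminal") is an assumed layout, not confirmed
    by the problem statement.
    """
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["PERID", "Criminal"])
    for pid, pred in zip(person_ids, predictions):
        writer.writerow([pid, int(pred)])


def make_baseline_submission(test_path, out_path):
    """Write an all-zeros submission, useful only to check the file format.

    Both paths are supplied by the caller; no file names are assumed here.
    """
    with open(test_path, newline="") as f:
        test_rows = read_rows(f)
    ids = [row["PERID"] for row in test_rows]
    with open(out_path, "w", newline="") as f:
        write_submission(f, ids, [0] * len(ids))
```

An all-zeros submission scores an MCC of 0 under the usual zero-denominator convention, so it serves as a format check rather than a model.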

Variable Name  Description
PERID          Person ID
IFATHER        FATHER IN HOUSEHOLD
NRCH17_2       RECODED # R's CHILDREN < 18 IN HOUSEHOLD
IRHHSIZ2       RECODE - IMPUTATION-REVISED # PERSONS IN HH
IIHHSIZ2       IMPUTATION INDICATOR
IRKI17_2       IMPUTATION-REVISED # KIDS AGED<18 IN HH
IIKI17_2       IRKI17_2-IMPUTATION INDICATOR
IRHH65_2       REC - IMPUTATION-REVISED # OF PER IN HH AGED>=65
IIHH65_2       IRHH65_2-IMPUTATION INDICATOR
PRXRETRY       SELECTED PROXY UNAVAILABLE, OTHER PROXY AVAILABLE?
PRXYDATA       IS PROXY ANSWERING INSURANCE/INCOME QS
MEDICARE       COVERED BY MEDICARE
CAIDCHIP       COVERED BY MEDICAID/CHIP
CHAMPUS        COV BY TRICARE, CHAMPUS, CHAMPVA, VA, MILITARY
PRVHLTIN       COVERED BY PRIVATE INSURANCE
GRPHLTIN       PRIVATE PLAN OFFERED THROUGH EMPLOYER OR UNION
HLTINNOS       COVERED BY HEALTH INSUR
HLCNOTYR       ANYTIME DID NOT HAVE HEALTH INS/COVER PAST 12 MOS
HLCNOTMO       PAST 12 MOS, HOW MANY MOS W/O COVERAGE
HLCLAST        TIME SINCE LAST HAD HEALTH CARE COVERAGE
HLLOSRSN       MAIN REASON STOPPED COVERED BY HEALTH INSURANCE
HLNVCOST       COST TOO HIGH
HLNVOFFR       EMPLOYER DOESN'T OFFER
HLNVREF        INSURANCE COMPANY REFUSED COVERAGE
HLNVNEED       DON'T NEED IT
HLNVSOR        NEVER HAD HLTH INS SOME OTHER REASON
IRMCDCHP       IMPUTATION REVISED CAIDCHIP
IIMCDCHP       MEDICAID/CHIP - IMPUTATION INDICATOR
IRMEDICR       MEDICARE - IMPUTATION REVISED
IIMEDICR       MEDICARE - IMPUTATION INDICATOR
IRCHMPUS       CHAMPUS - IMPUTATION REVISED
IICHMPUS       CHAMPUS - IMPUTATION INDICATOR
IRPRVHLT       PRIVATE HEALTH INSURANCE - IMPUTATION REVISED
IIPRVHLT       PRIVATE HEALTH INSURANCE - IMPUTATION INDICATOR
IROTHHLT       OTHER HEALTH INSURANCE - IMPUTATION REVISED
IIOTHHLT       OTHER HEALTH INSURANCE - IMPUTATION INDICATOR
HLCALLFG       FLAG IF EVERY FORM OF HEALTH INS REPORTED
HLCALL99       YES TO MEDICARE/MEDICAID/CHAMPUS/PRVHLTIN
ANYHLTI2       COVERED BY ANY HEALTH INSURANCE - RECODE
IRINSUR4       RC-OVERALL HEALTH INSURANCE - IMPUTATION REVISED
IIINSUR4       RC-OVERALL HEALTH INSURANCE - IMPUTATION INDICATOR
OTHINS         RC-OTHER HEALTH INSURANCE
CELLNOTCL      NOT A CELL PHONE
CELLWRKNG      WORKING CELL PHONE
IRFAMSOC       FAM RECEIVE SS OR RR PAYMENTS - IMPUTATION REVISED
IIFAMSOC       FAM RECEIVE SS OR RR PAYMENTS - IMPUTATION INDICATOR
IRFAMSSI       FAM RECEIVE SSI - IMPUTATION REVISED
IIFAMSSI       FAM RECEIVE SSI - IMPUTATION INDICATOR
IRFSTAMP       RESP/OTH FAM MEM REC FOOD STAMPS - IMPUTATION REVISED
IIFSTAMP       RESP/OTH FAM MEM REC FOOD STAMPS - IMPUTATION INDICATOR
IRFAMPMT       FAM RECEIVE PUBLIC ASSIST - IMPUTATION REVISED
IIFAMPMT       FAM RECEIVE PUBLIC ASSIST - IMPUTATION INDICATOR
IRFAMSVC       FAM REC WELFARE/JOB PL/CHILDCARE - IMPUTATION REVISED
IIFAMSVC       FAM REC WELFARE/JOB PL/CHILDCARE - IMPUTATION INDICATOR
IRWELMOS       IMP. REVISED - NO.OF MONTHS ON WELFARE
IIWELMOS       NO OF MONTHS ON WELFARE - IMPUTATION INDICATOR
IRPINC3        RESP TOT INCOME (FINER CAT) - IMP REV
IRFAMIN3       RECODE - IMP.REVISED - TOT FAM INCOME
IIPINC3        RESP TOT INCOME (FINER CAT) - IMP INDIC
IIFAMIN3       IRFAMIN3 - IMPUTATION INDICATOR
GOVTPROG       RC-PARTICIPATED IN ONE OR MORE GOVT ASSIST PROGRAMS
POVERTY3       RC-POVERTY LEVEL
TOOLONG        RESP SAID INTERVIEW WAS TOO LONG
TROUBUND       DID RESP HAVE TROUBLE UNDERSTANDING INTERVIEW
PDEN10         POPULATION DENSITY 2010
COUTYP2        COUNTY METRO/NONMETRO STATUS
MAIIN102       MAJORITY AMER INDIAN AREA INDICATOR FOR SEGMENT
AIIND102       AMER INDIAN AREA INDICATOR
ANALWT_C       FIN PRSN-LEVEL SIMPLE WGHT
VESTR          ANALYSIS STRATUM
VEREP          ANALYSIS REPLICATE
Criminal       Target Variable
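
The variable names above follow a visible pattern: names starting with `II` are imputation indicators, and `ANALWT_C`, `VESTR` and `VEREP` describe the survey weight, stratum and replicate. The sketch below sorts column names into groups on that basis. The function name `group_columns` and the group labels are hypothetical, and whether any group should be excluded from modelling is a judgment call that the problem statement does not settle.

```python
def group_columns(names, id_col="PERID", target_col="Criminal"):
    """Sort column names into rough roles based on the data dictionary.

    Grouping rules (assumptions drawn from the variable descriptions):
      - id_col and target_col are set aside,
      - ANALWT_C, VESTR, VEREP are treated as survey design variables,
      - names starting with "II" are treated as imputation indicators,
      - everything else is treated as an ordinary feature.
    """
    survey_design = {"ANALWT_C", "VESTR", "VEREP"}
    groups = {
        "id": [],
        "target": [],
        "survey_design": [],
        "imputation_indicator": [],
        "feature": [],
    }
    for name in names:
        if name == id_col:
            groups["id"].append(name)
        elif name == target_col:
            groups["target"].append(name)
        elif name in survey_design:
            groups["survey_design"].append(name)
        elif name.startswith("II"):
            groups["imputation_indicator"].append(name)
        else:
            groups["feature"].append(name)
    return groups


# A small subset of the columns listed above, for illustration.
example = ["PERID", "IFATHER", "IRHHSIZ2", "IIHHSIZ2",
           "ANALWT_C", "VESTR", "VEREP", "Criminal"]
for role, cols in group_columns(example).items():
    print(role, cols)
```

Keeping the groups separate makes it easy to compare models trained with and without the indicator or survey design columns, rather than committing to one choice up front.
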
Time limit: 5
Memory limit: 256
Source limit: