2020-08-03T14:45:23,491 Created temporary directory: /tmp/pip-ephem-wheel-cache-c2ro9x8m
2020-08-03T14:45:23,495 Created temporary directory: /tmp/pip-req-tracker-qi8cr40e
2020-08-03T14:45:23,496 Initialized build tracking at /tmp/pip-req-tracker-qi8cr40e
2020-08-03T14:45:23,496 Created build tracker: /tmp/pip-req-tracker-qi8cr40e
2020-08-03T14:45:23,497 Entered build tracker: /tmp/pip-req-tracker-qi8cr40e
2020-08-03T14:45:23,498 Created temporary directory: /tmp/pip-wheel-p7jn2a4e
2020-08-03T14:45:23,515 1 location(s) to search for versions of odus:
2020-08-03T14:45:23,515 * https://pypi.org/simple/odus/
2020-08-03T14:45:23,516 Fetching project page and analyzing links: https://pypi.org/simple/odus/
2020-08-03T14:45:23,516 Getting page https://pypi.org/simple/odus/
2020-08-03T14:45:23,519 Found index url https://pypi.org/simple
2020-08-03T14:45:23,725 Skipping link: No binaries permitted for odus: https://files.pythonhosted.org/packages/54/5c/314e6bd75cce4b45cfa5ff43a70ae97783a9f4c55b54b5874421db76e2c8/odus-0.0.1-py3-none-any.whl#sha256=0e3ab1c76df9b648112534048773c9611fbaa38831575c724ec6e59144fac86f (from https://pypi.org/simple/odus/)
2020-08-03T14:45:23,726 Found link https://files.pythonhosted.org/packages/64/96/703d2db924bce4c35daef3f2289bf3c4d793e1e1ed655969728a0ebfae80/odus-0.0.1.tar.gz#sha256=d8c345a55f5b07f1c53d62f7f72f5346df441c410f952eb7184f52367c4ae925 (from https://pypi.org/simple/odus/), version: 0.0.1
2020-08-03T14:45:23,727 Skipping link: No binaries permitted for odus: https://files.pythonhosted.org/packages/f6/ba/337ae3f8204b56312a4d55be33b6865da599a0a526b55d0ce394d9c9d8cb/odus-0.0.3-py3-none-any.whl#sha256=10d8a8a9aef138da0d54948a147d0ba0b410c9fca2bc98286c6a8d13c6e1ff47 (from https://pypi.org/simple/odus/)
2020-08-03T14:45:23,727 Found link https://files.pythonhosted.org/packages/4d/d9/c620386320ec5d657cf3865bbc4df315bdf0eeb1709789cddb96ae428371/odus-0.0.3.tar.gz#sha256=b02a21f2e2efa307cfa66b6ad89faa9029527664777ae4c43b25335fee0c5362 (from https://pypi.org/simple/odus/), version: 0.0.3
2020-08-03T14:45:23,728 Skipping link: No binaries permitted for odus: https://files.pythonhosted.org/packages/e4/99/517b778c38af9c1bfdebd76679a5435554b4f56f15a9a7951b9eec694b7d/odus-0.0.5-py3-none-any.whl#sha256=53222cf6d230a133df6d16a0e95de840c3ac70f7f32580e88ae7641638b87ac4 (from https://pypi.org/simple/odus/)
2020-08-03T14:45:23,728 Found link https://files.pythonhosted.org/packages/6a/84/a24a9f511406da5ebee3411751582d9a38ac77c57bb792b734ffe8710bc7/odus-0.0.5.tar.gz#sha256=761f997aa4eac4d7b120f2a55f46976e87568264a469be229ab8f556bb4b12b2 (from https://pypi.org/simple/odus/), version: 0.0.5
2020-08-03T14:45:23,729 Skipping link: No binaries permitted for odus: https://files.pythonhosted.org/packages/55/7e/5433d19e19618fc86242778ff09df3307c46a9f95f6e52f8ab69a22de31e/odus-0.0.6-py3-none-any.whl#sha256=0dc996d310b924f8055e79ede0d24f62eb0e24a16f4ad50d17f03a156ff736ab (from https://pypi.org/simple/odus/)
2020-08-03T14:45:23,729 Found link https://files.pythonhosted.org/packages/be/ea/a3d88b705ff73b7a7435335d1b2d6aa6202a78a9537d9ed219e4b77e428b/odus-0.0.6.tar.gz#sha256=e43b83b217218c592b938968f034b6f4045d5511baef54ec3c0b1bda60dbcad3 (from https://pypi.org/simple/odus/), version: 0.0.6
2020-08-03T14:45:23,738 Given no hashes to check 1 links for project 'odus': discarding no candidates
2020-08-03T14:45:23,739 Using version 0.0.6 (newest of versions: 0.0.6)
2020-08-03T14:45:23,744 Collecting odus==0.0.6
2020-08-03T14:45:23,748 Created temporary directory: /tmp/pip-unpack-nw9wyybq
2020-08-03T14:45:23,853 Downloading odus-0.0.6.tar.gz (15 kB)
2020-08-03T14:45:23,931 Added odus==0.0.6 from https://files.pythonhosted.org/packages/be/ea/a3d88b705ff73b7a7435335d1b2d6aa6202a78a9537d9ed219e4b77e428b/odus-0.0.6.tar.gz#sha256=e43b83b217218c592b938968f034b6f4045d5511baef54ec3c0b1bda60dbcad3 to build tracker '/tmp/pip-req-tracker-qi8cr40e'
2020-08-03T14:45:23,933 Running setup.py (path:/tmp/pip-wheel-p7jn2a4e/odus/setup.py) egg_info for package odus
2020-08-03T14:45:23,934 Created temporary directory: /tmp/pip-pip-egg-info-k3z809f5
2020-08-03T14:45:23,935 Running command python setup.py egg_info
2020-08-03T14:45:25,588 Setup params -------------------------------------------------------
2020-08-03T14:45:25,589 {
2020-08-03T14:45:25,589 "name": "odus",
2020-08-03T14:45:25,590 "version": "0.0.6",
2020-08-03T14:45:25,590 "url": "https://github.com/thorwhalen/odus",
2020-08-03T14:45:25,591 "author": "Thor Whalen",
2020-08-03T14:45:25,591 "author_email": "thorwhalen1@gmail.com",
2020-08-03T14:45:25,592 "license": "MIT",
2020-08-03T14:45:25,592 "include_package_data": true,
2020-08-03T14:45:25,593 "platforms": "any",
2020-08-03T14:45:25,593 "long_description": "
Table of Contents
\n\n\n\n```python\n# %load_ext autoreload\n# %autoreload 2\n```\n\n# Introduction\n\nODUS (for Older Drug User Study) contains data and tools to study the drug use of older drug users.\n\nEssentially, there are these are tools:\n\n- To get prepared data on the 119 \"trajectories\" describing 31 variables (drug use, social, etc.) over time of 119 different respondents.\n\n- To vizualize these trajectories in various ways\n\n- To create pdfs of any selection of these trajectories and variables\n\n- To make count tables for any combinations of the variables: Essential step of any Markovian or Bayesian analysis.\n\n- To make probability (joint or conditional) tables from any combination of the variables\n\n- To operate on these count and probability tables, thus enabling inference operations\n\n\n# Installation\n\nYou need to have python 3.7+ to run this notebook.\n\nAnd you'll need to have `odus`, which you get by doing\n\n```\npip install odus\n```\n\n(And if you don't have pip then, well... how to put it... ha ha ha!)\n\nBut if you're the type, you can also just get the source from `https://github.com/thorwhalen/odus`. \n\nOh, and pull requests etc. are welcome!\n\nStars, likes, references, and coffee also welcome.\n\nAnd if you want to donate: Donate to a charity that will help the people understand and make policies surrounding the use of substances.\n\nA simple flowchart about the architecture:\n\n
\n\n# Getting some resources\n\n\n```python\nfrom matplotlib.pylab import *\nfrom numpy import *\nimport seaborn as sns\n\nimport os\nfrom py2store.stores.local_store import RelativePathFormatStore\nfrom py2store.mixins import ReadOnlyMixin\nfrom py2store.base import Store\n\n\nfrom io import BytesIO\nfrom spyn.ppi.pot import Pot, ProbPot\nfrom collections import UserDict, Counter\nimport numpy as np\nimport pandas as pd\n\nfrom ut.ml.feature_extraction.sequential_var_sets import PVar, VarSet, DfData, VarSetFactory\nfrom IPython.display import Image\n\nfrom odus.analysis_utils import *\n\nfrom odus.dacc import DfStore, counts_of_kps, Dacc, VarSetCountsStore, \\\n mk_pvar_struct, PotStore, _commun_columns_of_dfs, Struct, mk_pvar_str_struct, VarStr\n\nfrom odus.plot_utils import plot_life_course\n```\n\n\n```python\nfrom odus import data_dir, data_path_of\nsurvey_dir = data_dir\ndata_dir\n```\n\n\n\n\n '/D/Dropbox/dev/p3/proj/odus/odus/data'\n\n\n\n\n```python\ndf_store = DfStore(data_dir + '/{}.xlsx')\nlen(df_store)\ncstore = VarSetCountsStore(df_store)\nv = mk_pvar_struct(df_store, only_for_cols_in_all_dfs=True)\ns = mk_pvar_str_struct(v)\nf, df = cstore.df_store.head()\npstore = PotStore(df_store)\n```\n\n# Poking around\n\n## df_store\n\nA df_store is a key-value store where the key is the xls file and the value is the prepared dataframe\n\n\n```python\nlen(df_store)\n```\n\n\n\n\n 119\n\n\n\n\n```python\nit = iter(df_store.values())\nfor i in range(5): # skip five first\n _ = next(it)\ndf = next(it) # get the one I want\ndf.head(3)\n```\n\n\n\n\n\n\n
\n \n \n category | \n RURAL | \n SUBURBAN | \n URBAN/CITY | \n HOMELESS | \n INCARCERATION | \n WORK | \n SON/DAUGHTER | \n SIBLING | \n FATHER/MOTHER | \n SPOUSE | \n ... | \n METHAMPHETAMINE | \n AS PRESCRIBED OPIOID | \n NOT AS PRESCRIBED OPIOID | \n HEROIN | \n OTHER OPIOID | \n INJECTED | \n IN TREATMENT | \n Selects States below | \n Georgia | \n Pennsylvania | \n
\n \n age | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n
\n \n \n \n 11 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n 12 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n 13 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n
\n
3 rows \u00d7 31 columns
\n
\n\n\n\n\n```python\nprint(df.columns.values)\n```\n\n ['RURAL' 'SUBURBAN' 'URBAN/CITY' 'HOMELESS' 'INCARCERATION' 'WORK'\n 'SON/DAUGHTER' 'SIBLING' 'FATHER/MOTHER' 'SPOUSE'\n 'OTHER (WHO?, FILL IN BRACKETS HERE)' 'FRIEND USER' 'FRIEND NON USER'\n 'MENTAL ILLNESS' 'PHYSICAL ILLNESS' 'LOSS OF LOVED ONE' 'TOBACCO'\n 'MARIJUANA' 'ALCOHOL' 'HAL/LSD/XTC/CLUBDRUG' 'COCAINE/CRACK'\n 'METHAMPHETAMINE' 'AS PRESCRIBED OPIOID' 'NOT AS PRESCRIBED OPIOID'\n 'HEROIN' 'OTHER OPIOID' 'INJECTED' 'IN TREATMENT' 'Selects States below'\n 'Georgia' 'Pennsylvania']\n\n\n\n```python\nt = df[['ALCOHOL', 'TOBACCO']]\nt.head(3)\n```\n\n\n\n\n\n\n
\n \n \n category | \n ALCOHOL | \n TOBACCO | \n
\n \n age | \n | \n | \n
\n \n \n \n 11 | \n 0 | \n 0 | \n
\n \n 12 | \n 0 | \n 0 | \n
\n \n 13 | \n 0 | \n 0 | \n
\n \n
\n
\n\n\n\n\n```python\nc = Counter()\nfor i, r in t.iterrows():\n c.update([tuple(r.to_list())])\nc\n```\n\n\n\n\n Counter({(0, 0): 6, (1, 0): 4, (1, 1): 9, (0, 1): 2})\n\n\n\n\n```python\ndef count_tuples(dataframe):\n c = Counter()\n for i, r in dataframe.iterrows():\n c.update([tuple(r.to_list())])\n return c\n```\n\n\n```python\nfields = ['ALCOHOL', 'TOBACCO']\n# do it for every one\nc = Counter()\nfor df in df_store.values():\n c.update(count_tuples(df[fields]))\nc\n```\n\n\n\n\n Counter({(0, 1): 903, (1, 1): 1343, (0, 0): 240, (1, 0): 179})\n\n\n\n\n```python\npd.Series(c)\n```\n\n\n\n\n 0 1 903\n 1 1 1343\n 0 0 240\n 1 0 179\n dtype: int64\n\n\n\n\n```python\n# Powerful! You can use that with several pairs and get some nice probabilities. Look up Naive Bayes.\n```\n\n## Viewing trajectories\n\n\n```python\nimport itertools\nfrom functools import partial\nfrom odus.util import write_images\nfrom odus.plot_utils import plot_life, life_plots, write_trajectories_to_file\n\nihead = lambda it: itertools.islice(it, 0, 5)\n```\n\n### Viewing a single trajectory\n\n\n```python\nk = next(iter(df_store)) # get the first key\nprint(f\"k: {k}\") # print it\nplot_life(df_store[k]) # plot the trajectory\n```\n\n k: surveys/B24.xlsx\n\n\n\n\n\n\n\n```python\nplot_life(df_store[k], fields=[s.in_treatment, s.injected]) # only want two fields\n```\n\n\n\n\n\n### Flip over all (or some) trajectories\n\n\n```python\ngen = life_plots(df_store)\n```\n\n\n```python\nnext(gen) # launch to get the next trajectory\n```\n\n\n\n\n \n\n\n\n\n\n\n\nGet three trajectories, but only over two fields.\n\n\n```python\n# fields = [s.in_treatment, s.injected]\nfields = [s.physical_illness, s.as_prescribed_opioid, s.heroin, s.other_opioid]\nkeys = list(df_store)[:10]\n# print(f\"keys={keys}\")\naxs = [x for x in life_plots(df_store, fields, keys=keys)];\n```\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n## Making a pdf of 
trajectories\n\n\n```python\nwrite_trajectories_to_file(df_store, fields, keys, fp='three_respondents_two_fields.pdf');\n```\n\n\n```python\nwrite_trajectories_to_file(df_store, fp='all_respondents_all_fields.pdf');\n```\n\n\n```python\n \n```\n\n## Demo s and v\n\n\n```python\nprint(list(filter(lambda x: not x.startswith('__'), dir(s))))\n```\n\n ['alcohol', 'as_prescribed_opioid', 'cocaine_crack', 'father_mother', 'hal_lsd_xtc_clubdrug', 'heroin', 'homeless', 'in_treatment', 'incarceration', 'injected', 'loss_of_loved_one', 'marijuana', 'mental_illness', 'methamphetamine', 'not_as_prescribed_opioid', 'other_opioid', 'physical_illness', 'rural', 'sibling', 'son_daughter', 'suburban', 'tobacco', 'urban_city', 'work']\n\n\n\n```python\ns.heroin\n```\n\n\n\n\n 'HEROIN'\n\n\n\n\n```python\nv.heroin\n```\n\n\n\n\n PVar('HEROIN', 0)\n\n\n\n\n```python\nv.heroin - 1\n```\n\n\n\n\n PVar('HEROIN', -1)\n\n\n\n## cstore\n\n\n```python\n# cstore[v.alcohol, v.tobacco]\ncstore[v.as_prescribed_opioid-1, v.heroin]\n```\n\n\n\n\n Counter({(0, 0): 1026, (1, 0): 264, (0, 1): 1108, (1, 1): 148})\n\n\n\n\n```python\npd.Series(cstore[v.as_prescribed_opioid-1, v.heroin])\n```\n\n\n\n\n 0 0 1026\n 1 0 264\n 0 1 1108\n 1 1 148\n dtype: int64\n\n\n\n\n```python\ncstore[v.alcohol, v.tobacco, v.heroin]\n```\n\n\n\n\n Counter({(0, 0, 1): 427,\n (1, 0, 1): 656,\n (1, 1, 1): 687,\n (0, 0, 0): 189,\n (0, 1, 1): 476,\n (0, 1, 0): 51,\n (1, 0, 0): 133,\n (1, 1, 0): 46})\n\n\n\n\n```python\ncstore[v.alcohol-1, v.alcohol]\n```\n\n\n\n\n Counter({(0, 0): 994, (1, 1): 1375, (1, 0): 90, (0, 1): 87})\n\n\n\n\n```python\ncstore[v.alcohol-1, v.alcohol, v.tobacco]\n```\n\n\n\n\n Counter({(0, 0, 1): 807,\n (1, 1, 1): 1220,\n (1, 0, 0): 26,\n (0, 1, 1): 76,\n (0, 0, 0): 187,\n (1, 1, 0): 155,\n (0, 1, 0): 11,\n (1, 0, 1): 64})\n\n\n\n\n```python\nt = pd.Series(cstore[v.alcohol-1, v.alcohol, v.tobacco])\nt.loc[t.index]\n```\n\n\n\n\n \n\n\n\n## pstore\n\n\n```python\nt = pstore[s.alcohol-1, 
s.alcohol]\nt\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 994\n 1 87\n 1 0 90\n 1 1375\n\n\n\n\n```python\nt.tb\n```\n\n\n\n\n\n\n
\n \n \n | \n ALCOHOL-1 | \n ALCOHOL | \n pval | \n
\n \n \n \n | \n 0 | \n 0 | \n 994 | \n
\n \n | \n 0 | \n 1 | \n 87 | \n
\n \n | \n 1 | \n 0 | \n 90 | \n
\n \n | \n 1 | \n 1 | \n 1375 | \n
\n \n
\n
\n\n\n\n\n```python\nt / []\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 0.390416\n 1 0.034171\n 1 0 0.035350\n 1 0.540063\n\n\n\n\n```python\nt[s.alcohol-1]\n```\n\n\n\n\n pval\n ALCOHOL-1 \n 0 1081\n 1 1465\n\n\n\n\n```python\nt / t[s.alcohol-1] # cond prob!\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 0.919519\n 1 0.080481\n 1 0 0.061433\n 1 0.938567\n\n\n\n\n```python\ntt = pstore[s.alcohol, s.tobacco]\ntt\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 240\n 1 903\n 1 0 179\n 1 1343\n\n\n\n\n```python\ntt / tt[s.alcohol]\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 0.209974\n 1 0.790026\n 1 0 0.117608\n 1 0.882392\n\n\n\n\n```python\ntt / tt[s.tobacco]\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 0.572792\n 1 0 0.427208\n 0 1 0.402048\n 1 1 0.597952\n\n\n\n\n```python\n\n```\n\n## Scrap place\n\n\n```python\nt = pstore[s.as_prescribed_opioid-1, s.heroin-1, s.heroin]\nt\n\n```\n\n\n\n\n pval\n AS PRESCRIBED OPIOID-1 HEROIN-1 HEROIN \n 0 0 0 927\n 1 172\n 1 0 99\n 1 936\n 1 0 0 249\n 1 33\n 1 0 15\n 1 115\n\n\n\n\n```python\ntt = t / t[s.as_prescribed_opioid-1, s.heroin-1] # cond prob!\ntt\n```\n\n\n\n\n pval\n AS PRESCRIBED OPIOID-1 HEROIN-1 HEROIN \n 0 0 0 0.843494\n 1 0.156506\n 1 0 0.095652\n 1 0.904348\n 1 0 0 0.882979\n 1 0.117021\n 1 0 0.115385\n 1 0.884615\n\n\n\n\n```python\ntt.tb\n```\n\n\n\n\n\n\n
\n \n \n | \n AS PRESCRIBED OPIOID-1 | \n HEROIN-1 | \n HEROIN | \n pval | \n
\n \n \n \n | \n 0 | \n 0 | \n 0 | \n 0.843494 | \n
\n \n | \n 0 | \n 0 | \n 1 | \n 0.156506 | \n
\n \n | \n 0 | \n 1 | \n 0 | \n 0.095652 | \n
\n \n | \n 0 | \n 1 | \n 1 | \n 0.904348 | \n
\n \n | \n 1 | \n 0 | \n 0 | \n 0.882979 | \n
\n \n | \n 1 | \n 0 | \n 1 | \n 0.117021 | \n
\n \n | \n 1 | \n 1 | \n 0 | \n 0.115385 | \n
\n \n | \n 1 | \n 1 | \n 1 | \n 0.884615 | \n
\n \n
\n
\n\n\n\n```\nAS PRESCRIBED OPIOID-1\tHEROIN-1\tHEROIN\t\n0\t0\t0\t0.843494\n0\t0\t1\t0.156506\n1\t0\t0\t0.882979\n1\t0\t1\t0.117021\n```\n\n\n```python\n0.117021 / 0.156506\n```\n\n\n\n\n 0.7477093529960512\n\n\n\n\n```python\n\n```\n\n\n```python\nprob_of_heroin_given_presc_op = 0.359223\nprob_of_heroin_given_not_presc_op = 0.519213\n\nprob_of_heroin_given_presc_op / prob_of_heroin_given_not_presc_op\n```\n\n\n\n\n 0.6918605658949217\n\n\n\n\n```python\nprob_of_heroin_given_not_presc_op / prob_of_heroin_given_presc_op\n```\n\n\n\n\n 1.4453779407220584\n\n\n\n# Potential Calculus Experimentations\n\n\n```python\n# survey_dir = '/D/Dropbox/others/Miriam/python/ProcessedSurveys'\ndf_store = DfStore(survey_dir + '/{}.xlsx')\nlen(df_store)\n```\n\n\n\n\n 119\n\n\n\n\n```python\ncstore = VarSetCountsStore(df_store)\nv = mk_pvar_struct(df_store, only_for_cols_in_all_dfs=True)\ns = mk_pvar_str_struct(v)\nf, df = cstore.df_store.head()\ndf.head(3)\n```\n\n\n\n\n\n\n
\n \n \n category | \n RURAL | \n SUBURBAN | \n URBAN/CITY | \n HOMELESS | \n INCARCERATION | \n WORK | \n SON/DAUGHTER | \n SIBLING | \n FATHER/MOTHER | \n SPOUSE | \n ... | \n HAL/LSD/XTC/CLUBDRUG | \n COCAINE/CRACK | \n METHAMPHETAMINE | \n AS PRESCRIBED OPIOID | \n NOT AS PRESCRIBED OPIOID | \n HEROIN | \n OTHER OPIOID | \n INJECTED | \n IN TREATMENT | \n Massachusetts | \n
\n \n age | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n
\n \n \n \n 16 | \n 0 | \n 1 | \n 0 | \n 0 | \n 1 | \n 0 | \n 1 | \n 1 | \n 1 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n
\n \n 17 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 1 | \n 1 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n
\n \n 18 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 1 | \n 1 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n
\n \n
\n
3 rows \u00d7 29 columns
\n
\n\n\n\n\n```python\ncstore = VarSetCountsStore(df_store)\ncstore.mk_pvar_attrs()\n```\n\n\n```python\nfrom odus.dacc import DfStore, counts_of_kps, Dacc, plot_life_course, VarSetCountsStore, mk_pvar_struct, PotStore\npstore = PotStore(df_store)\npstore.mk_pvar_attrs()\np = pstore[v.homeless - 1, v.incarceration]\np\n```\n\n\n\n\n pval\n HOMELESS-1 INCARCERATION \n 0 0 1690\n 1 577\n 1 0 192\n 1 87\n\n\n\n\n```python\np / []\n```\n\n\n\n\n pval\n HOMELESS-1 INCARCERATION \n 0 0 0.663786\n 1 0.226630\n 1 0 0.075412\n 1 0.034171\n\n\n\n\n```python\npstore[v.incarceration]\n```\n\n\n\n\n pval\n INCARCERATION \n 0 1989\n 1 676\n\n\n\n\n```python\npstore[v.alcohol-1, v.loss_of_loved_one]\n```\n\n\n\n\n pval\n ALCOHOL-1 LOSS OF LOVED ONE \n 0 0 990\n 1 91\n 1 0 1321\n 1 144\n\n\n\n\n```python\ntw = pstore[v.tobacco, v.work]\nmw = pstore[v.marijuana, v.work]\naw = pstore[v.alcohol, v.work]\nw = pstore[v.work]\n\n```\n\n\n```python\nevid_t = Pot.from_hard_evidence(**{s.tobacco: 1})\nevid_m = Pot.from_hard_evidence(**{s.marijuana: 1})\nevid_a = Pot.from_hard_evidence(**{s.alcohol: 1})\nevid_a\n```\n\n\n\n\n pval\n ALCOHOL \n 1 1\n\n\n\n\n```python\naw\n```\n\n\n\n\n pval\n ALCOHOL WORK \n 0 0 431\n 1 712\n 1 0 448\n 1 1074\n\n\n\n\n```python\nw / []\n```\n\n\n\n\n pval\n WORK \n 0 0.329831\n 1 0.670169\n\n\n\n\n```python\n(evid_m * mw) / []\n```\n\n\n\n\n pval\n MARIJUANA WORK \n 1 0 0.350603\n 1 0.649397\n\n\n\n\n```python\n(evid_t * tw) / []\n```\n\n\n\n\n pval\n TOBACCO WORK \n 1 0 0.313001\n 1 0.686999\n\n\n\n\n```python\n(evid_a * aw) / []\n```\n\n\n\n\n pval\n ALCOHOL WORK \n 1 0 0.29435\n 1 0.70565\n\n\n\n# Extra scrap\n\n\n```python\n# from graphviz import Digraph\n# Digraph(body=\"\"\"\n# raw -> data -> count -> prob\n# raw [label=\"excel files (one per respondent)\" shape=folder]\n# data [label=\"dataframes\" shape=folder]\n# count [label=\"counts for any combinations of the variables in the data\" shape=box3d]\n# prob [label=\"probabilities for any combinations 
of the variables in the data\" shape=box3d]\n# \"\"\".split('\\n'))\n```\n\n\n```python\n\n```\n",
2020-08-03T14:45:25,595 "long_description_content_type": "text/markdown",
2020-08-03T14:45:25,595 "install_requires": [
2020-08-03T14:45:25,596 "py2store",
2020-08-03T14:45:25,596 "pandas",
2020-08-03T14:45:25,597 "numpy",
2020-08-03T14:45:25,598 "Pillow",
2020-08-03T14:45:25,598 "spyn",
2020-08-03T14:45:25,598 "matplotlib",
2020-08-03T14:45:25,599 "openpyxl",
2020-08-03T14:45:25,599 "argh"
2020-08-03T14:45:25,600 ],
2020-08-03T14:45:25,600 "description": "Tools to provide easy access to prepared data to data scientists that can't be asked.",
2020-08-03T14:45:25,601 "keywords": [
2020-08-03T14:45:25,601 "data",
2020-08-03T14:45:25,602 "data access",
2020-08-03T14:45:25,602 "drug use",
2020-08-03T14:45:25,603 "markov",
2020-08-03T14:45:25,603 "bayesian"
2020-08-03T14:45:25,603 ]
2020-08-03T14:45:25,937 }
2020-08-03T14:45:25,938 --------------------------------------------------------------------
2020-08-03T14:45:25,938 running egg_info
2020-08-03T14:45:25,943 creating /tmp/pip-pip-egg-info-k3z809f5/odus.egg-info
2020-08-03T14:45:25,960 writing /tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/PKG-INFO
2020-08-03T14:45:25,970 writing dependency_links to /tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/dependency_links.txt
2020-08-03T14:45:25,976 writing requirements to /tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/requires.txt
2020-08-03T14:45:25,978 writing top-level names to /tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/top_level.txt
2020-08-03T14:45:25,983 writing manifest file '/tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/SOURCES.txt'
2020-08-03T14:45:25,990 reading manifest file '/tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/SOURCES.txt'
2020-08-03T14:45:26,000 writing manifest file '/tmp/pip-pip-egg-info-k3z809f5/odus.egg-info/SOURCES.txt'
2020-08-03T14:45:26,122 Source in /tmp/pip-wheel-p7jn2a4e/odus has version 0.0.6, which satisfies requirement odus==0.0.6 from https://files.pythonhosted.org/packages/be/ea/a3d88b705ff73b7a7435335d1b2d6aa6202a78a9537d9ed219e4b77e428b/odus-0.0.6.tar.gz#sha256=e43b83b217218c592b938968f034b6f4045d5511baef54ec3c0b1bda60dbcad3
2020-08-03T14:45:26,124 Removed odus==0.0.6 from https://files.pythonhosted.org/packages/be/ea/a3d88b705ff73b7a7435335d1b2d6aa6202a78a9537d9ed219e4b77e428b/odus-0.0.6.tar.gz#sha256=e43b83b217218c592b938968f034b6f4045d5511baef54ec3c0b1bda60dbcad3 from build tracker '/tmp/pip-req-tracker-qi8cr40e'
2020-08-03T14:45:26,142 Building wheels for collected packages: odus
2020-08-03T14:45:26,152 Created temporary directory: /tmp/pip-wheel-1t5d36f_
2020-08-03T14:45:26,152 Building wheel for odus (setup.py): started
2020-08-03T14:45:26,153 Destination directory: /tmp/pip-wheel-1t5d36f_
2020-08-03T14:45:26,153 Running command /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-wheel-p7jn2a4e/odus/setup.py'"'"'; __file__='"'"'/tmp/pip-wheel-p7jn2a4e/odus/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-1t5d36f_
2020-08-03T14:45:27,738 Setup params -------------------------------------------------------
2020-08-03T14:45:27,739 {
2020-08-03T14:45:27,740 "name": "odus",
2020-08-03T14:45:27,740 "version": "0.0.6",
2020-08-03T14:45:27,741 "url": "https://github.com/thorwhalen/odus",
2020-08-03T14:45:27,741 "author": "Thor Whalen",
2020-08-03T14:45:27,742 "author_email": "thorwhalen1@gmail.com",
2020-08-03T14:45:27,742 "license": "MIT",
2020-08-03T14:45:27,743 "include_package_data": true,
2020-08-03T14:45:27,743 "platforms": "any",
2020-08-03T14:45:27,744 "long_description": "Table of Contents
\n\n\n\n```python\n# %load_ext autoreload\n# %autoreload 2\n```\n\n# Introduction\n\nODUS (for Older Drug User Study) contains data and tools to study the drug use of older drug users.\n\nEssentially, there are these are tools:\n\n- To get prepared data on the 119 \"trajectories\" describing 31 variables (drug use, social, etc.) over time of 119 different respondents.\n\n- To vizualize these trajectories in various ways\n\n- To create pdfs of any selection of these trajectories and variables\n\n- To make count tables for any combinations of the variables: Essential step of any Markovian or Bayesian analysis.\n\n- To make probability (joint or conditional) tables from any combination of the variables\n\n- To operate on these count and probability tables, thus enabling inference operations\n\n\n# Installation\n\nYou need to have python 3.7+ to run this notebook.\n\nAnd you'll need to have `odus`, which you get by doing\n\n```\npip install odus\n```\n\n(And if you don't have pip then, well... how to put it... ha ha ha!)\n\nBut if you're the type, you can also just get the source from `https://github.com/thorwhalen/odus`. \n\nOh, and pull requests etc. are welcome!\n\nStars, likes, references, and coffee also welcome.\n\nAnd if you want to donate: Donate to a charity that will help the people understand and make policies surrounding the use of substances.\n\nA simple flowchart about the architecture:\n\n
\n\n# Getting some resources\n\n\n```python\nfrom matplotlib.pylab import *\nfrom numpy import *\nimport seaborn as sns\n\nimport os\nfrom py2store.stores.local_store import RelativePathFormatStore\nfrom py2store.mixins import ReadOnlyMixin\nfrom py2store.base import Store\n\n\nfrom io import BytesIO\nfrom spyn.ppi.pot import Pot, ProbPot\nfrom collections import UserDict, Counter\nimport numpy as np\nimport pandas as pd\n\nfrom ut.ml.feature_extraction.sequential_var_sets import PVar, VarSet, DfData, VarSetFactory\nfrom IPython.display import Image\n\nfrom odus.analysis_utils import *\n\nfrom odus.dacc import DfStore, counts_of_kps, Dacc, VarSetCountsStore, \\\n mk_pvar_struct, PotStore, _commun_columns_of_dfs, Struct, mk_pvar_str_struct, VarStr\n\nfrom odus.plot_utils import plot_life_course\n```\n\n\n```python\nfrom odus import data_dir, data_path_of\nsurvey_dir = data_dir\ndata_dir\n```\n\n\n\n\n '/D/Dropbox/dev/p3/proj/odus/odus/data'\n\n\n\n\n```python\ndf_store = DfStore(data_dir + '/{}.xlsx')\nlen(df_store)\ncstore = VarSetCountsStore(df_store)\nv = mk_pvar_struct(df_store, only_for_cols_in_all_dfs=True)\ns = mk_pvar_str_struct(v)\nf, df = cstore.df_store.head()\npstore = PotStore(df_store)\n```\n\n# Poking around\n\n## df_store\n\nA df_store is a key-value store where the key is the xls file and the value is the prepared dataframe\n\n\n```python\nlen(df_store)\n```\n\n\n\n\n 119\n\n\n\n\n```python\nit = iter(df_store.values())\nfor i in range(5): # skip five first\n _ = next(it)\ndf = next(it) # get the one I want\ndf.head(3)\n```\n\n\n\n\n\n\n
\n \n \n category | \n RURAL | \n SUBURBAN | \n URBAN/CITY | \n HOMELESS | \n INCARCERATION | \n WORK | \n SON/DAUGHTER | \n SIBLING | \n FATHER/MOTHER | \n SPOUSE | \n ... | \n METHAMPHETAMINE | \n AS PRESCRIBED OPIOID | \n NOT AS PRESCRIBED OPIOID | \n HEROIN | \n OTHER OPIOID | \n INJECTED | \n IN TREATMENT | \n Selects States below | \n Georgia | \n Pennsylvania | \n
\n \n age | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n | \n
\n \n \n \n 11 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n 12 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n 13 | \n 0 | \n 1 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n 0 | \n ... | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 0 | \n 1 | \n 1 | \n 0 | \n
\n \n
\n
3 rows \u00d7 31 columns
\n
\n\n\n\n\n```python\nprint(df.columns.values)\n```\n\n ['RURAL' 'SUBURBAN' 'URBAN/CITY' 'HOMELESS' 'INCARCERATION' 'WORK'\n 'SON/DAUGHTER' 'SIBLING' 'FATHER/MOTHER' 'SPOUSE'\n 'OTHER (WHO?, FILL IN BRACKETS HERE)' 'FRIEND USER' 'FRIEND NON USER'\n 'MENTAL ILLNESS' 'PHYSICAL ILLNESS' 'LOSS OF LOVED ONE' 'TOBACCO'\n 'MARIJUANA' 'ALCOHOL' 'HAL/LSD/XTC/CLUBDRUG' 'COCAINE/CRACK'\n 'METHAMPHETAMINE' 'AS PRESCRIBED OPIOID' 'NOT AS PRESCRIBED OPIOID'\n 'HEROIN' 'OTHER OPIOID' 'INJECTED' 'IN TREATMENT' 'Selects States below'\n 'Georgia' 'Pennsylvania']\n\n\n\n```python\nt = df[['ALCOHOL', 'TOBACCO']]\nt.head(3)\n```\n\n\n\n\n\n\n
\n \n \n category | \n ALCOHOL | \n TOBACCO | \n
\n \n age | \n | \n | \n
\n \n \n \n 11 | \n 0 | \n 0 | \n
\n \n 12 | \n 0 | \n 0 | \n
\n \n 13 | \n 0 | \n 0 | \n
\n \n
\n
\n\n\n\n\n```python\nc = Counter()\nfor i, r in t.iterrows():\n c.update([tuple(r.to_list())])\nc\n```\n\n\n\n\n Counter({(0, 0): 6, (1, 0): 4, (1, 1): 9, (0, 1): 2})\n\n\n\n\n```python\ndef count_tuples(dataframe):\n c = Counter()\n for i, r in dataframe.iterrows():\n c.update([tuple(r.to_list())])\n return c\n```\n\n\n```python\nfields = ['ALCOHOL', 'TOBACCO']\n# do it for every one\nc = Counter()\nfor df in df_store.values():\n c.update(count_tuples(df[fields]))\nc\n```\n\n\n\n\n Counter({(0, 1): 903, (1, 1): 1343, (0, 0): 240, (1, 0): 179})\n\n\n\n\n```python\npd.Series(c)\n```\n\n\n\n\n 0 1 903\n 1 1 1343\n 0 0 240\n 1 0 179\n dtype: int64\n\n\n\n\n```python\n# Powerful! You can use that with several pairs and get some nice probabilities. Look up Naive Bayes.\n```\n\n## Viewing trajectories\n\n\n```python\nimport itertools\nfrom functools import partial\nfrom odus.util import write_images\nfrom odus.plot_utils import plot_life, life_plots, write_trajectories_to_file\n\nihead = lambda it: itertools.islice(it, 0, 5)\n```\n\n### Viewing a single trajectory\n\n\n```python\nk = next(iter(df_store)) # get the first key\nprint(f\"k: {k}\") # print it\nplot_life(df_store[k]) # plot the trajectory\n```\n\n k: surveys/B24.xlsx\n\n\n\n\n\n\n\n```python\nplot_life(df_store[k], fields=[s.in_treatment, s.injected]) # only want two fields\n```\n\n\n\n\n\n### Flip over all (or some) trajectories\n\n\n```python\ngen = life_plots(df_store)\n```\n\n\n```python\nnext(gen) # launch to get the next trajectory\n```\n\n\n\n\n \n\n\n\n\n\n\n\nGet three trajectories, but only over two fields.\n\n\n```python\n# fields = [s.in_treatment, s.injected]\nfields = [s.physical_illness, s.as_prescribed_opioid, s.heroin, s.other_opioid]\nkeys = list(df_store)[:10]\n# print(f\"keys={keys}\")\naxs = [x for x in life_plots(df_store, fields, keys=keys)];\n```\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n## Making a pdf of 
trajectories\n\n\n```python\nwrite_trajectories_to_file(df_store, fields, keys, fp='three_respondents_two_fields.pdf');\n```\n\n\n```python\nwrite_trajectories_to_file(df_store, fp='all_respondents_all_fields.pdf');\n```\n\n\n```python\n \n```\n\n## Demo s and v\n\n\n```python\nprint(list(filter(lambda x: not x.startswith('__'), dir(s))))\n```\n\n ['alcohol', 'as_prescribed_opioid', 'cocaine_crack', 'father_mother', 'hal_lsd_xtc_clubdrug', 'heroin', 'homeless', 'in_treatment', 'incarceration', 'injected', 'loss_of_loved_one', 'marijuana', 'mental_illness', 'methamphetamine', 'not_as_prescribed_opioid', 'other_opioid', 'physical_illness', 'rural', 'sibling', 'son_daughter', 'suburban', 'tobacco', 'urban_city', 'work']\n\n\n\n```python\ns.heroin\n```\n\n\n\n\n 'HEROIN'\n\n\n\n\n```python\nv.heroin\n```\n\n\n\n\n PVar('HEROIN', 0)\n\n\n\n\n```python\nv.heroin - 1\n```\n\n\n\n\n PVar('HEROIN', -1)\n\n\n\n## cstore\n\n\n```python\n# cstore[v.alcohol, v.tobacco]\ncstore[v.as_prescribed_opioid-1, v.heroin]\n```\n\n\n\n\n Counter({(0, 0): 1026, (1, 0): 264, (0, 1): 1108, (1, 1): 148})\n\n\n\n\n```python\npd.Series(cstore[v.as_prescribed_opioid-1, v.heroin])\n```\n\n\n\n\n 0 0 1026\n 1 0 264\n 0 1 1108\n 1 1 148\n dtype: int64\n\n\n\n\n```python\ncstore[v.alcohol, v.tobacco, v.heroin]\n```\n\n\n\n\n Counter({(0, 0, 1): 427,\n (1, 0, 1): 656,\n (1, 1, 1): 687,\n (0, 0, 0): 189,\n (0, 1, 1): 476,\n (0, 1, 0): 51,\n (1, 0, 0): 133,\n (1, 1, 0): 46})\n\n\n\n\n```python\ncstore[v.alcohol-1, v.alcohol]\n```\n\n\n\n\n Counter({(0, 0): 994, (1, 1): 1375, (1, 0): 90, (0, 1): 87})\n\n\n\n\n```python\ncstore[v.alcohol-1, v.alcohol, v.tobacco]\n```\n\n\n\n\n Counter({(0, 0, 1): 807,\n (1, 1, 1): 1220,\n (1, 0, 0): 26,\n (0, 1, 1): 76,\n (0, 0, 0): 187,\n (1, 1, 0): 155,\n (0, 1, 0): 11,\n (1, 0, 1): 64})\n\n\n\n\n```python\nt = pd.Series(cstore[v.alcohol-1, v.alcohol, v.tobacco])\nt.loc[t.index]\n```\n\n\n\n\n \n\n\n\n## pstore\n\n\n```python\nt = pstore[s.alcohol-1, 
s.alcohol]\nt\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 994\n 1 87\n 1 0 90\n 1 1375\n\n\n\n\n```python\nt.tb\n```\n\n\n\n\n\n\n
| ALCOHOL-1 | ALCOHOL | pval |\n| --- | --- | --- |\n| 0 | 0 | 994 |\n| 0 | 1 | 87 |\n| 1 | 0 | 90 |\n| 1 | 1 | 1375 |\n
\n\n\n\n\n```python\nt / []\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 0.390416\n 1 0.034171\n 1 0 0.035350\n 1 0.540063\n\n\n\n\n```python\nt[s.alcohol-1]\n```\n\n\n\n\n pval\n ALCOHOL-1 \n 0 1081\n 1 1465\n\n\n\n\n```python\nt / t[s.alcohol-1] # cond prob!\n```\n\n\n\n\n pval\n ALCOHOL-1 ALCOHOL \n 0 0 0.919519\n 1 0.080481\n 1 0 0.061433\n 1 0.938567\n\n\n\n\n```python\ntt = pstore[s.alcohol, s.tobacco]\ntt\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 240\n 1 903\n 1 0 179\n 1 1343\n\n\n\n\n```python\ntt / tt[s.alcohol]\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 0.209974\n 1 0.790026\n 1 0 0.117608\n 1 0.882392\n\n\n\n\n```python\ntt / tt[s.tobacco]\n```\n\n\n\n\n pval\n ALCOHOL TOBACCO \n 0 0 0.572792\n 1 0 0.427208\n 0 1 0.402048\n 1 1 0.597952\n\n\n\n\n```python\n\n```\n\n## Scrap place\n\n\n```python\nt = pstore[s.as_prescribed_opioid-1, s.heroin-1, s.heroin]\nt\n\n```\n\n\n\n\n pval\n AS PRESCRIBED OPIOID-1 HEROIN-1 HEROIN \n 0 0 0 927\n 1 172\n 1 0 99\n 1 936\n 1 0 0 249\n 1 33\n 1 0 15\n 1 115\n\n\n\n\n```python\ntt = t / t[s.as_prescribed_opioid-1, s.heroin-1] # cond prob!\ntt\n```\n\n\n\n\n pval\n AS PRESCRIBED OPIOID-1 HEROIN-1 HEROIN \n 0 0 0 0.843494\n 1 0.156506\n 1 0 0.095652\n 1 0.904348\n 1 0 0 0.882979\n 1 0.117021\n 1 0 0.115385\n 1 0.884615\n\n\n\n\n```python\ntt.tb\n```\n\n\n\n\n\n\n
| AS PRESCRIBED OPIOID-1 | HEROIN-1 | HEROIN | pval |\n| --- | --- | --- | --- |\n| 0 | 0 | 0 | 0.843494 |\n| 0 | 0 | 1 | 0.156506 |\n| 0 | 1 | 0 | 0.095652 |\n| 0 | 1 | 1 | 0.904348 |\n| 1 | 0 | 0 | 0.882979 |\n| 1 | 0 | 1 | 0.117021 |\n| 1 | 1 | 0 | 0.115385 |\n| 1 | 1 | 1 | 0.884615 |\n
\n\n\n\n```\nAS PRESCRIBED OPIOID-1\tHEROIN-1\tHEROIN\t\n0\t0\t0\t0.843494\n0\t0\t1\t0.156506\n1\t0\t0\t0.882979\n1\t0\t1\t0.117021\n```\n\n\n```python\n0.117021 / 0.156506\n```\n\n\n\n\n 0.7477093529960512\n\n\n\n\n```python\n\n```\n\n\n```python\nprob_of_heroin_given_presc_op = 0.359223\nprob_of_heroin_given_not_presc_op = 0.519213\n\nprob_of_heroin_given_presc_op / prob_of_heroin_given_not_presc_op\n```\n\n\n\n\n 0.6918605658949217\n\n\n\n\n```python\nprob_of_heroin_given_not_presc_op / prob_of_heroin_given_presc_op\n```\n\n\n\n\n 1.4453779407220584\n\n\n\n# Potential Calculus Experimentations\n\n\n```python\n# survey_dir = '/D/Dropbox/others/Miriam/python/ProcessedSurveys'\ndf_store = DfStore(survey_dir + '/{}.xlsx')\nlen(df_store)\n```\n\n\n\n\n 119\n\n\n\n\n```python\ncstore = VarSetCountsStore(df_store)\nv = mk_pvar_struct(df_store, only_for_cols_in_all_dfs=True)\ns = mk_pvar_str_struct(v)\nf, df = cstore.df_store.head()\ndf.head(3)\n```\n\n\n\n\n\n\n
| category / age | RURAL | SUBURBAN | URBAN/CITY | HOMELESS | INCARCERATION | WORK | SON/DAUGHTER | SIBLING | FATHER/MOTHER | SPOUSE | ... | HAL/LSD/XTC/CLUBDRUG | COCAINE/CRACK | METHAMPHETAMINE | AS PRESCRIBED OPIOID | NOT AS PRESCRIBED OPIOID | HEROIN | OTHER OPIOID | INJECTED | IN TREATMENT | Massachusetts |\n| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |\n| 16 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |\n| 17 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |\n| 18 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |\n\n3 rows \u00d7 29 columns\n
\n\n\n\n\n```python\ncstore = VarSetCountsStore(df_store)\ncstore.mk_pvar_attrs()\n```\n\n\n```python\nfrom odus.dacc import DfStore, counts_of_kps, Dacc, plot_life_course, VarSetCountsStore, mk_pvar_struct, PotStore\npstore = PotStore(df_store)\npstore.mk_pvar_attrs()\np = pstore[v.homeless - 1, v.incarceration]\np\n```\n\n\n\n\n pval\n HOMELESS-1 INCARCERATION \n 0 0 1690\n 1 577\n 1 0 192\n 1 87\n\n\n\n\n```python\np / []\n```\n\n\n\n\n pval\n HOMELESS-1 INCARCERATION \n 0 0 0.663786\n 1 0.226630\n 1 0 0.075412\n 1 0.034171\n\n\n\n\n```python\npstore[v.incarceration]\n```\n\n\n\n\n pval\n INCARCERATION \n 0 1989\n 1 676\n\n\n\n\n```python\npstore[v.alcohol-1, v.loss_of_loved_one]\n```\n\n\n\n\n pval\n ALCOHOL-1 LOSS OF LOVED ONE \n 0 0 990\n 1 91\n 1 0 1321\n 1 144\n\n\n\n\n```python\ntw = pstore[v.tobacco, v.work]\nmw = pstore[v.marijuana, v.work]\naw = pstore[v.alcohol, v.work]\nw = pstore[v.work]\n\n```\n\n\n```python\nevid_t = Pot.from_hard_evidence(**{s.tobacco: 1})\nevid_m = Pot.from_hard_evidence(**{s.marijuana: 1})\nevid_a = Pot.from_hard_evidence(**{s.alcohol: 1})\nevid_a\n```\n\n\n\n\n pval\n ALCOHOL \n 1 1\n\n\n\n\n```python\naw\n```\n\n\n\n\n pval\n ALCOHOL WORK \n 0 0 431\n 1 712\n 1 0 448\n 1 1074\n\n\n\n\n```python\nw / []\n```\n\n\n\n\n pval\n WORK \n 0 0.329831\n 1 0.670169\n\n\n\n\n```python\n(evid_m * mw) / []\n```\n\n\n\n\n pval\n MARIJUANA WORK \n 1 0 0.350603\n 1 0.649397\n\n\n\n\n```python\n(evid_t * tw) / []\n```\n\n\n\n\n pval\n TOBACCO WORK \n 1 0 0.313001\n 1 0.686999\n\n\n\n\n```python\n(evid_a * aw) / []\n```\n\n\n\n\n pval\n ALCOHOL WORK \n 1 0 0.29435\n 1 0.70565\n\n\n\n# Extra scrap\n\n\n```python\n# from graphviz import Digraph\n# Digraph(body=\"\"\"\n# raw -> data -> count -> prob\n# raw [label=\"excel files (one per respondent)\" shape=folder]\n# data [label=\"dataframes\" shape=folder]\n# count [label=\"counts for any combinations of the variables in the data\" shape=box3d]\n# prob [label=\"probabilities for any combinations 
of the variables in the data\" shape=box3d]\n# \"\"\".split('\\n'))\n```\n\n\n```python\n\n```\n",
2020-08-03T14:45:27,746 "long_description_content_type": "text/markdown",
2020-08-03T14:45:27,746 "install_requires": [
2020-08-03T14:45:27,747 "py2store",
2020-08-03T14:45:27,748 "pandas",
2020-08-03T14:45:27,748 "numpy",
2020-08-03T14:45:27,749 "Pillow",
2020-08-03T14:45:27,749 "spyn",
2020-08-03T14:45:27,750 "matplotlib",
2020-08-03T14:45:27,751 "openpyxl",
2020-08-03T14:45:27,751 "argh"
2020-08-03T14:45:27,752 ],
2020-08-03T14:45:27,752 "description": "Tools to provide easy access to prepared data to data scientists that can't be asked.",
2020-08-03T14:45:27,753 "keywords": [
2020-08-03T14:45:27,753 "data",
2020-08-03T14:45:27,754 "data access",
2020-08-03T14:45:27,754 "drug use",
2020-08-03T14:45:27,755 "markov",
2020-08-03T14:45:27,756 "bayesian"
2020-08-03T14:45:27,757 ]
2020-08-03T14:45:27,757 }
2020-08-03T14:45:27,758 --------------------------------------------------------------------
2020-08-03T14:45:28,081 running bdist_wheel
2020-08-03T14:45:28,094 running build
2020-08-03T14:45:28,105 installing to build/bdist.linux-armv7l/wheel
2020-08-03T14:45:28,106 running install
2020-08-03T14:45:28,108 running install_egg_info
2020-08-03T14:45:28,150 running egg_info
2020-08-03T14:45:28,166 writing odus.egg-info/PKG-INFO
2020-08-03T14:45:28,174 writing dependency_links to odus.egg-info/dependency_links.txt
2020-08-03T14:45:28,178 writing requirements to odus.egg-info/requires.txt
2020-08-03T14:45:28,180 writing top-level names to odus.egg-info/top_level.txt
2020-08-03T14:45:28,188 reading manifest file 'odus.egg-info/SOURCES.txt'
2020-08-03T14:45:28,195 writing manifest file 'odus.egg-info/SOURCES.txt'
2020-08-03T14:45:28,200 Copying odus.egg-info to build/bdist.linux-armv7l/wheel/odus-0.0.6-py3.7.egg-info
2020-08-03T14:45:28,228 running install_scripts
2020-08-03T14:45:28,440 creating build/bdist.linux-armv7l/wheel/odus-0.0.6.dist-info/WHEEL
2020-08-03T14:45:28,445 creating '/tmp/pip-wheel-1t5d36f_/odus-0.0.6-py3-none-any.whl' and adding 'build/bdist.linux-armv7l/wheel' to it
2020-08-03T14:45:28,456 adding 'odus-0.0.6.dist-info/METADATA'
2020-08-03T14:45:28,460 adding 'odus-0.0.6.dist-info/WHEEL'
2020-08-03T14:45:28,462 adding 'odus-0.0.6.dist-info/top_level.txt'
2020-08-03T14:45:28,464 adding 'odus-0.0.6.dist-info/RECORD'
2020-08-03T14:45:28,466 removing build/bdist.linux-armv7l/wheel
2020-08-03T14:45:28,592 Building wheel for odus (setup.py): finished with status 'done'
2020-08-03T14:45:28,596 Created wheel for odus: filename=odus-0.0.6-py3-none-any.whl size=6690 sha256=1e9ed08a2db3bd732081b27fc0f51b1663c70f2f9d791af3cca9d7393d8bb542
2020-08-03T14:45:28,597 Stored in directory: /tmp/pip-ephem-wheel-cache-c2ro9x8m/wheels/76/19/75/cb0c81022e84cb532d1f95dbb271f65932072bfdc33f672867
2020-08-03T14:45:28,601 Successfully built odus
2020-08-03T14:45:28,608 Removed build tracker: '/tmp/pip-req-tracker-qi8cr40e'