gll89's personal blog http://blog.sciencenet.cn/u/gll89


Common usage of Pandas

Views: 1625 | 2019-4-5 16:05 | Category: Python | System category: Research notes | Pandas

  1. read_csv(filepath, usecols)

    e.g., subnames_no_MPR = pd.read_csv(filepath, delimiter=',', usecols=['Subject ID', 'Age'])

    usecols: Return a subset of the columns. If list-like, all elements must either be positional (i.e. integer indices into the document columns) or strings that correspond to column names provided either by the user in names or inferred from the document header row(s). For example, a valid list-like usecols parameter would be [0, 1, 2] or ['foo','bar', 'baz']. Element order is ignored, so usecols=[0, 1] is the same as [1, 0].

    https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
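As a rough illustration of the usecols behaviour described above, the sketch below reads a small in-memory CSV instead of a real file. The CSV content and the extra column names (Sex, Scanner) are made up for the example; only 'Subject ID' and 'Age' come from the note above.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file at `filepath`.
csv_text = """Subject ID,Age,Sex,Scanner
S001,63,F,A
S002,71,M,B
S003,58,F,A
"""

# Select columns by name; the other columns are not loaded.
by_name = pd.read_csv(io.StringIO(csv_text), delimiter=',',
                      usecols=['Subject ID', 'Age'])

# Select the same columns by position. Element order in usecols is
# ignored, so [1, 0] yields the same frame as [0, 1].
by_position = pd.read_csv(io.StringIO(csv_text), delimiter=',',
                          usecols=[1, 0])

print(by_name.columns.tolist())
print(by_position.columns.tolist())
```

Both calls should return the columns in file order ('Subject ID', then 'Age'). If a specific column order is needed, reindex after reading, e.g. by_name[['Age', 'Subject ID']].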

  2. DataFrame.drop_duplicates(subset, keep)

    e.g., subs_MPR = subnames_MPR.drop_duplicates(subset=['Subject ID', 'Age'], keep='first')


    subset : column label or sequence of labels, optional

    Only consider certain columns for identifying duplicates; by default, all columns are used.

    keep : {'first', 'last', False}, default 'first'

        first : Drop duplicates except for the first occurrence.

        last : Drop duplicates except for the last occurrence.

        False : Drop all duplicates.

    http://pandas.pydata.org/pandas-docs/version/0.17/generated/pandas.DataFrame.drop_duplicates.html

    https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-in-python-pandas
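The three keep options can be compared side by side with a minimal sketch like the one below. The DataFrame contents and the 'Scan' column are invented for illustration; only the 'Subject ID' and 'Age' column names are taken from the example above.

```python
import pandas as pd

# Hypothetical data: S001 and S003 each appear twice on (Subject ID, Age).
df = pd.DataFrame({
    'Subject ID': ['S001', 'S001', 'S002', 'S003', 'S003'],
    'Age': [63, 63, 71, 58, 58],
    'Scan': ['a', 'b', 'a', 'a', 'b'],
})

cols = ['Subject ID', 'Age']

# Keep the first row of each duplicate group.
first = df.drop_duplicates(subset=cols, keep='first')

# Keep the last row of each duplicate group.
last = df.drop_duplicates(subset=cols, keep='last')

# Drop every row that has a duplicate, keeping only unique rows.
unique_only = df.drop_duplicates(subset=cols, keep=False)

print(first.index.tolist())        # [0, 2, 3]
print(last.index.tolist())         # [1, 2, 4]
print(unique_only.index.tolist())  # [2]
```

Note that drop_duplicates keeps the original index labels. Chaining .reset_index(drop=True) is one way to renumber the rows if that matters downstream.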




https://blog.sciencenet.cn/blog-1969089-1171591.html
