
Common usage of Pandas

Views: 1673 | 2019-4-5 16:05 | Personal category: Python | System category: Research notes | Pandas

read_csv(filepath, usecols)
e.g., subnames_no_MPR = pd.read_csv(filepath, delimiter=',', usecols=['Subject ID', 'Age'])
usecols: Return a subset of the columns. If list-like, all elements must either be positional (i.e. integer indices into the document columns) or strings that correspond to column names provided either by the user in names or inferred from the document header row(s). For example, a valid list-like usecols parameter would be [0, 1, 2] or ['foo','bar', 'baz']. Element order is ignored, so usecols=[0, 1] is the same as [1, 0].
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
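The usecols behaviour described above can be tried without a file on disk. The following is a minimal sketch, assuming a small made-up CSV (the 'Sex' column and the rows are hypothetical and only serve the illustration); it reads the same two columns once by name and once by position.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the file at `filepath`.
csv_text = (
    "Subject ID,Age,Sex\n"
    "S001,34,F\n"
    "S002,41,M\n"
)

# Select columns by name; the 'Sex' column is not loaded.
by_name = pd.read_csv(
    io.StringIO(csv_text), delimiter=",", usecols=["Subject ID", "Age"]
)

# Select the same columns by position. Element order in usecols is ignored,
# so [1, 0] still returns the columns in file order.
by_position = pd.read_csv(
    io.StringIO(csv_text), delimiter=",", usecols=[1, 0]
)

print(by_name.columns.tolist())     # ['Subject ID', 'Age']
print(by_name.equals(by_position))  # True
```

Because the order of usecols is ignored, a particular column order has to be requested after loading, for example with by_name[['Age', 'Subject ID']].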
DataFrame.drop_duplicates(subset, keep)
e.g., subs_MPR = subnames_MPR.drop_duplicates(subset=['Subject ID', 'Age'], keep='first')

subset : column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by default use all of the columns
keep : {‘first’, ‘last’, False}, default ‘first’
first : Drop duplicates except for the first occurrence.
last : Drop duplicates except for the last occurrence.
False : Drop all duplicates.
http://pandas.pydata.org/pandas-docs/version/0.17/generated/pandas.DataFrame.drop_duplicates.html
https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-in-python-pandas
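To make the three keep options concrete, here is a small sketch on made-up data (the 'Scan' column and all values are hypothetical). Duplicates are identified only by 'Subject ID' and 'Age', so the 'Scan' column shows which of the duplicated rows survives in each case.

```python
import pandas as pd

# Hypothetical data: subject S001 appears twice with the same age.
subnames_MPR = pd.DataFrame(
    {
        "Subject ID": ["S001", "S001", "S002", "S003"],
        "Age": [34, 34, 41, 29],
        "Scan": ["a", "b", "c", "d"],
    }
)

cols = ["Subject ID", "Age"]

# keep='first': keep the first row of each duplicate group.
kept_first = subnames_MPR.drop_duplicates(subset=cols, keep="first")

# keep='last': keep the last row of each duplicate group.
kept_last = subnames_MPR.drop_duplicates(subset=cols, keep="last")

# keep=False: drop every row that has a duplicate.
kept_none = subnames_MPR.drop_duplicates(subset=cols, keep=False)

print(kept_first["Scan"].tolist())  # ['a', 'c', 'd']
print(kept_last["Scan"].tolist())   # ['b', 'c', 'd']
print(kept_none["Scan"].tolist())   # ['c', 'd']
```

The surviving rows keep their original index labels; calling reset_index(drop=True) on the result renumbers them from 0 if that is needed.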

To repost this article, please contact the original author for permission and state that it comes from 高琳琳's ScienceNet blog.
Link: https://blog.sciencenet.cn/blog-1969089-1171591.html
