Great tip for dynamic data selection using SAS Viya and Python

[Category: Development (Python) | Posted by: 店小二03 | Date: 2018 | Author: 红领巾]

This post was also written by SAS' Xiangxiang Meng.

SAS Cloud Analytic Services (CAS) in SAS Viya lets you work with the same server from a variety of clients: SAS, Python, Lua, Java, and REST.

The SAS Scripting Wrapper for Analytics Transfer (SWAT) package is a Python interface to CAS. With this package, you can load data into memory and apply CAS actions to transform, summarize, model, and score the data. You still retain the ease of use of Python on the client side to post-process the CAS result tables.

But before you can do any analysis in CAS, you need some data to work with and a way to get to it. There are two components to data access in CAS: caslibs and CAS tables. Caslibs are definitions that give access to a resource that contains data. When you want to analyze data from a caslib resource, you load the data into a CAS table. A CAS table contains columns of data and information about the data in the columns.

The CASTable objects in SWAT support many of the methods defined by pandas DataFrames. So, if you are familiar with the pandas data analysis library, working with CAS tables should come naturally. CAS enables you to subset tables using Python expressions. Rather than hard-coding which rows to select, you write conditions in Python that are evaluated against the data itself, and CAS uses those conditions to determine which rows to return.

For example, rather than selecting data by fixed row and column positions, you can write conditions based on the values in the table and let CAS determine which rows match. This is done using the same syntax as DataFrames. CASColumn objects support Python's comparison operators and build a filter that subsets the rows in the table. You can then use the result of that comparison to index into a CASTable. It sounds much more complicated than it is, so let's look at an example.

The examples below are from the Iris flower data set, which is available in the SASHELP library, in all distributions of SAS. The listed code and output are produced using the IPython interface but can be employed with Jupyter Notebook just as easily.

If we want to get a CASTable that contains only the rows where petal_length is greater than 7, we can use the following expression to create our filter.
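
The original listing is not reproduced here, so the following is a minimal sketch of the two steps, assuming a CSV copy of the Iris data with a `petal_length` column. The host name, port, file path, and the helper name `select_long_petals` are placeholders chosen for illustration, not values from the post.

```python
def select_long_petals(tbl, min_length=7):
    """Two-step selection on a CASTable (a pandas DataFrame accepts the same syntax)."""
    # Step 1: on a CASTable the comparison returns a CASColumn that describes
    # the condition, not a single True/False value.
    expr = tbl.petal_length > min_length
    # Step 2: index the table with that expression to get the filtered view.
    return tbl[expr]


def main():
    # Hypothetical connection details and file path; adjust them for your site.
    # Credentials are assumed to come from an authinfo file.
    import swat  # needs the SWAT package and a reachable CAS server

    conn = swat.CAS("cas.example.com", 5570)
    tbl = conn.read_csv("iris.csv", casout={"name": "iris", "replace": True})
    print(select_long_petals(tbl).head())
    conn.close()
```

Calling `main()` requires a live CAS session; the helper itself only relies on attribute access, a comparison, and indexing.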


Behind the scenes, this expression creates a computed column that is used in a WHERE expression on the CASTable. The expression can then be used as an index value for a CASTable. Indexing this way essentially creates a Boolean mask: wherever the expression is true, the rows of the table are returned, and wherever it is false, the rows are filtered out.


These two steps are more commonly done in one line.


We can further filter rows out by indexing another comparison.
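
As a sketch of the one-line form and of chaining a second filter, assuming `tbl` is a CASTable already loaded with `petal_length` and `petal_width` columns. The function names and the `petal_width < 2` threshold are illustrative choices, not taken from the post.

```python
def long_petals(tbl):
    # Build the condition and index with it in a single expression.
    return tbl[tbl.petal_length > 7]


def long_narrow_petals(tbl):
    # Index the already-filtered table again to narrow it further.
    # The second condition is built from the filtered table's own column.
    newtbl = tbl[tbl.petal_length > 7]
    return newtbl[newtbl.petal_width < 2]
```

Each indexing step returns a new table object, so the original `tbl` stays available unfiltered.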


Comparisons can be joined using the bitwise operators & (and) and | (or). You do have to be careful with these, though, because of operator precedence. Bitwise operators have a higher precedence than comparisons such as greater-than and less-than, so you need to wrap each comparison in parentheses.
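
The precedence rule can be checked with Python's own parser, without a CAS server. This sketch uses the standard `ast` module; the variable names `a` and `b` and the helper `top_level_node` are placeholders for illustration.

```python
import ast


def top_level_node(source):
    """Return the name of the outermost operation Python sees in an expression."""
    return type(ast.parse(source, mode="eval").body).__name__


# Without parentheses, & binds first, so Python reads this as the chained
# comparison  a > (7 & b) < 2, which is not the intended filter.
unwrapped = "a > 7 & b < 2"

# With parentheses, each comparison is evaluated first and & joins the results.
wrapped = "(a > 7) & (b < 2)"

print(top_level_node(unwrapped))  # Compare
print(top_level_node(wrapped))    # BinOp
```

Applied to a table, the parenthesized form would look like `tbl[(tbl.petal_length > 7) & (tbl.petal_width < 2)]`, where the column names and thresholds are again only examples.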


In all cases, we are not changing anything about the underlying data in CAS. We are simply constructing a query that is executed with the CASTable when it is used as the parameter in a CAS action. You can see what is happening behind the scenes by displaying the resulting CASTable objects.
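
To make the idea of "constructing a query" concrete, here is a toy model in plain Python. It is not SWAT's implementation: the `Column` and `Table` classes are hypothetical stand-ins that only show how overloaded operators can accumulate a WHERE expression while the underlying data stays untouched.

```python
class Column:
    """Toy column: holds a fragment of an expression as text."""

    def __init__(self, expr):
        self.expr = expr

    def _binary(self, op, other):
        rhs = other.expr if isinstance(other, Column) else repr(other)
        return Column(f"({self.expr} {op} {rhs})")

    def __gt__(self, other):
        return self._binary(">", other)

    def __lt__(self, other):
        return self._binary("<", other)

    def __add__(self, other):
        return self._binary("+", other)

    def __and__(self, other):
        return self._binary("and", other)

    def __or__(self, other):
        return self._binary("or", other)


class Table:
    """Toy table: a name plus an optional WHERE expression, no data at all."""

    def __init__(self, name, where=None):
        self.name = name
        self.where = where

    def __getattr__(self, column_name):
        # Only reached when normal attribute lookup fails.
        if column_name.startswith("_"):
            raise AttributeError(column_name)
        return Column(column_name)

    def __getitem__(self, condition):
        # Indexing returns a new Table; the original keeps its own filter.
        if self.where is None:
            return Table(self.name, condition.expr)
        return Table(self.name, f"{self.where} and {condition.expr}")


tbl = Table("iris")
flt = tbl[tbl.petal_length > 7]
print(flt.where)                         # (petal_length > 7)
print(flt[flt.petal_width < 2].where)    # (petal_length > 7) and (petal_width < 2)
print(tbl.where)                         # None
```

The last line shows the point of the paragraph above: filtering produces a new object that carries a query, and the original table is left as it was.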


You can also do mathematical operations on columns with constants or other columns within your comparisons.
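
A short sketch of arithmetic inside a comparison, assuming `tbl` is a CASTable with numeric `petal_length` and `petal_width` columns. The function name and the threshold of 17.5 are arbitrary values picked for illustration.

```python
def large_petals(tbl, threshold=17.5):
    # Add two columns, scale the sum by a constant, then compare the result.
    # Arithmetic operators bind more tightly than >, so only the sum needs
    # its own parentheses.
    return tbl[(tbl.petal_length + tbl.petal_width) * 2 > threshold]
```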


The list of supported operations is shown in the table below.


The supported comparison operators are shown in the following table.


As you can see in the tables above, it is possible to do comparisons on character columns as well. This includes using many of Python’s string methods on the column values. These are accessed using the str attribute of the column, just like in DataFrames.
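
A sketch of a string-based filter through the `str` attribute, assuming `tbl` is a CASTable with a character column named `species` and that `upper` and `startswith` are among the string methods your SWAT version supports (both exist on pandas string accessors). The function name and the prefix are illustrative.

```python
def species_starting_with(tbl, prefix="SET"):
    # Upper-case the species values, then keep the rows whose value starts
    # with the given prefix. Each .str call returns a new column expression.
    return tbl[tbl.species.str.upper().str.startswith(prefix)]
```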


This simple syntax makes it much easier to manipulate data from the Python client when working in SAS Viya.

Another great tip? The Python client allows you to manipulate data on the fly, without moving or copying the data to another location. Creating computed columns speeds up data wrangling while giving you options for how you want to get there.

Want to learn more great tips about integrating Python with SAS Viya? Check out Kevin Smith and Xiangxiang Meng's SAS Viya: The Python Perspective to learn how Python can be integrated into SAS Viya and help you manipulate data with ease.
