aggregation

April 5, 2022

Have you ever heard something referred to as the bee’s knees? Most likely the person uttering that expression meant that it was truly amazing and extraordinary. Maybe you stopped and pondered the origin of the phrase. Well wonder no more! In the 1920s in the United States, people were obsessed with rhyming and anthropomorphism (giving human traits or attributes to animals). This era saw the introduction of multiple animal-centric idioms: the ant’s pants, the cat’s pajamas, and (you guessed it) the bee’s knees. Most likely this idiom refers to the pollen basket on the lower section of a worker bee’s legs (where the tibia and the tarsi connect). When bees are out foraging, they carry all the pollen they find in the pollen baskets on their legs. If you ever see a bee with lots of pollen on her legs, you know she’s been working hard… one might even say she’s an amazing and extraordinary worker!

SAS Visual Analytics has a lot of features that you might call the bee’s knees, but one of the most amazing and extraordinary features is the AggregateTable operator. This operator, introduced in SAS Visual Analytics 8.3, enables you to perform aggregations on data crossings that are independent of (or changed from) the data in your objects. This means you can use this operator to compare aggregations for different groups of data in one single object.

To illustrate the AggregateTable operator in action (and to keep with the theme), let’s consider an example.

I’m a hobby beekeeper in Texas. This means that I own and maintain a few hives in my backyard from which I collect honey and wax to sell to family and friends. I’m interested in learning about honey production in the United States for different years. I’m pretty sure I’m not the biggest honey producer in my state (or even my county), but I want to look at different crossings of total production (by state and year, by county and year, and by state).

The United States Department of Agriculture’s National Agricultural Statistics Service has Census data on honey production (measured in pounds) for all counties in the United States for 2002, 2007, 2012, and 2017.

Type of Aggregation: Add

To start, I would like to calculate the total production for each state and then display, for each year, the total for the state that produced the most honey (a data item I'll call Maximum by State). For example, in 2002 California produced the most honey of any state (15,585,925 pounds), and in 2017 North Dakota produced the most (24,296,437 pounds).

Because the table contains details by county and I’m interested in the total production by state, I will either need to create an aggregated data source that contains details by state, or I will need to use the AggregateTable operator. Since this post is about the wonders of the AggregateTable operator, let’s focus on that.

The AggregateTable operator requires five parameters:

  • Aggregation-Aggregated: The aggregation applied to the aggregated item when it is used in an object that displays fewer group-by crossings than the table in the expression.
  • Aggregation-Measure: The aggregation applied to the measure in the inner table context.
  • Type of aggregation: The type of aggregation that is performed. Values are Fixed, Add, or Remove.
  • Categories: The list of categories used to alter the data crossing for the aggregation.
  • Measure: The measure that is aggregated. A Table operator can be added as the measure to perform a nested aggregation.

It also has a nested operator, Table, that creates an intermediate table defined by the Type of aggregation, Categories, Measure, and Aggregation-Measure parameters.

For this example, I want to use a Key value object to see the total production values for the state that produced the most honey in each year. The object will contain one group-by crossing (or category): Year. The calculation, however, will need to use two group-by crossings to determine the highest producing state for each year: Year and State. Therefore, the Aggregation-Measure is _Sum_ (to calculate the total production by state), the Type of aggregation is Add (because I want to add a crossing for State to the calculation), Categories is set to State, and Measure is Production (lb).

The intermediate table will contain one row for each state and year and contain total production values.

Then, for each year, I want the highest production value (for 2002, 15,585,925 pounds). Therefore, the Aggregation-Aggregated parameter should be _Max_ to grab the maximum value for each year from the intermediate table.
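
If it helps to see the logic spelled out, here is a rough SQL sketch of what this calculation produces. The table name honey_census and the column names Year, State, and Production_lb are hypothetical stand-ins for the Census data, and this is ordinary SAS code rather than the Visual Analytics expression itself:

proc sql;
   /* Intermediate table (Type of aggregation = Add): one row per state
      and year with total production, per the _Sum_ Aggregation-Measure */
   create table state_year_totals as
   select Year, State, sum(Production_lb) as Total_Production
   from honey_census
   group by Year, State;

   /* Aggregation-Aggregated = _Max_: the highest state total for each year */
   create table max_by_state as
   select Year, max(Total_Production) as Maximum_by_State
   from state_year_totals
   group by Year;
quit;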

Then, I can display the Maximum by State with Year in a Key value object.

Note: Beginning in SAS Visual Analytics 2021.1.1 (May 2021), a new role is available for the Key value object, Lattice category. This enables you to display a key value object for each distinct value of a category data item (in this example, Year).

Now that I have a data item that contains the production amount for the highest producing state for each year, I can create some more complex calculations, like the percentage of total production for each year by the state that had the highest production. This will enable me to see if the highest producing state is doing all the heavy lifting or if all the states are producing a near-equal amount.
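
Sketched in the same hypothetical SQL terms (reusing the state_year_totals table from above), that follow-on calculation divides the top state's total by the year's overall total:

proc sql;
   /* Share of each year's total production that came from the
      highest-producing state */
   create table top_state_share as
   select Year,
          max(Total_Production) / sum(Total_Production) * 100
             as Pct_from_Top_State
   from state_year_totals
   group by Year;
quit;

In the report itself, this would just be a calculated item built from Maximum by State and the aggregated Production (lb).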

Type of Aggregation: Remove

The Type of aggregation parameter also enables you to remove categories from the calculation. For this example, suppose I want to compare the production in each county to the production from the highest producing county in that state (Maximum by County). I want to use a list table to compare these values.

The object will contain three group-by crossings: Year, State, and County. The calculation, however, will only use two group-by crossings to determine the highest producing county in each state for each year: Year and State. Therefore, the Aggregation-Measure is _Max_ (to calculate the maximum production in each state), the Type of aggregation is Remove (because I want to remove a crossing for County from the calculation), Categories is set to County, and Measure is Production (lb).

The intermediate table will contain one row for each state and year and contain the production values for the county in that state with the highest production. Notice that for this table, the aggregation for Production was set to Maximum to show the maximum production for each state.

Because the number of groupings in the object (County, Year, and State) is not fewer than the number of groupings in the calculation (Year and State), the Aggregation-Aggregated parameter is not applied and can be set to any value.

Then, I can display Maximum by County in a list table with total production by county to easily compare each county’s production with the production of the county in that state that had the highest production.

Now, I can calculate each county’s difference from the county with the highest production in that state for each year.
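
A comparable SQL sketch for the Remove case, with the same hypothetical table and column names (County added), covers both the intermediate table and the difference calculation:

proc sql;
   /* Intermediate table (Type of aggregation = Remove County): for each
      state and year, the production of the highest-producing county,
      per the _Max_ Aggregation-Measure */
   create table max_by_county as
   select Year, State, max(Production_lb) as Maximum_by_County
   from honey_census
   group by Year, State;

   /* Each county row compared with the top county in its state,
      including the difference from that top value */
   create table county_vs_top as
   select h.Year, h.State, h.County, h.Production_lb,
          m.Maximum_by_County,
          h.Production_lb - m.Maximum_by_County as Diff_from_Top
   from honey_census as h
        inner join max_by_county as m
           on h.Year = m.Year and h.State = m.State;
quit;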

Type of Aggregation: Fixed

The Type of aggregation parameter also enables you to set a fixed category for the calculation. For this example, suppose I want to compare the production in each county to the total production across all years by state (Total by State). I want to use a list table to compare these values.

The object will contain three group-by crossings: Year, State, and County. The calculation, however, will only use one group-by crossing to determine the total production by state across all years: State. Therefore, the Aggregation-Measure is _Sum_ (to calculate the total production by state across all years), the Type of aggregation is Fixed (because I want to fix the crossing to State for the calculation), Categories is set to State, and Measure is Production (lb).

The intermediate table will contain one row for each state and total production values across all years.

Because the number of groupings in the object (County, Year, and State) is not fewer than the number of groupings in the calculation (State), the Aggregation-Aggregated parameter is not applied and can be set to any value.

Then, I can display Total by State in a list table with total production by county to easily compare each county’s production with total production in the state across all years.

I can even compare total state values for each year with total production in that state across all years.

Then, I can calculate the share of total production produced each year.
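
And a final sketch for the Fixed case, again with hypothetical names and ordinary SQL standing in for the Visual Analytics expression:

proc sql;
   /* Intermediate table (Type of aggregation = Fixed, Categories = State):
      one row per state with total production across all years */
   create table total_by_state as
   select State, sum(Production_lb) as Total_by_State
   from honey_census
   group by State;

   /* Yearly state totals compared with the all-years state total,
      giving each year's share of the state's total production */
   create table yearly_share as
   select h.Year, h.State,
          sum(h.Production_lb) as Year_Total,
          t.Total_by_State,
          sum(h.Production_lb) / t.Total_by_State * 100 as Pct_of_State_Total
   from honey_census as h
        inner join total_by_state as t
           on h.State = t.State
   group by h.Year, h.State, t.Total_by_State;
quit;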

For more information about how to create advanced data items and filters for your SAS Visual Analytics reports, including examples that use different types of operators, check out my book Interactive Reports in SAS Visual Analytics: Advanced Features and Customization.

Bees help pollinate most of the food that you eat and are in desperate need of your help! There are many ways you can help the honeybee population thrive:

  • Become a beekeeper
  • Plant a garden for all pollinators (including bumblebees, butterflies, bats, and moths)
  • Reduce or eliminate pesticide use
  • Support your local beekeepers by buying local honey
  • Contact a beekeeping group if you see a swarm
  • Volunteer as a citizen data scientist by taking pictures and posting information on the iNaturalist app
  • Create a bee bath for those hot summer days
  • Build homes for native bees

 

The AggregateTable operator… It’s the bee’s knees! was published on SAS Users.

June 5, 2014

For many years, enterprise risk management (ERM) was more of a concept than a reality: a visionary strategy that spanned every department within an organisation, at every level. In recent years, however, ERM programmes have unquestionably been implemented by organisations of all sizes, with treasury central to the process. This article asks what constitutes an effective ERM programme and how a company goes about creating a solid risk framework, with treasury now playing a vital role.

ERM is no longer an academic vision with only limited application in the real world. Investment in ERM infrastructure and analytics is now viewed as integral to the management decision process, especially in financial services firms. A successful bank must conduct stress testing and create a capital planning approach that integrates risk and financial data. While regulation plays a part in this change, it is only a fraction of what banks must do to remain profitable.

An ERM vision must have an IT infrastructure and risk culture that support cross-functional collaboration, such as the partnership of risk and finance. Collaboration requires understanding all portfolio risks and how to manage those risks to meet the long-term goals of the bank. While a chief financial officer (CFO) is responsible for balance sheet growth, the chief risk officer (CRO) is in charge of risk control objectives, such as minimising balance sheet risk and calibrating this with the bank’s risk appetite.

The CRO must also avoid balance sheet exposure at all times. For the treasury function, Basel III regulations demand comprehensive assessments that affect businesses. For example, banks may examine each instrument in terms of its contribution to the balance sheet, leverage, liquidity, profit and loss (P&L), regulatory and economic capital, and risk-adjusted profitability. Getting a holistic view of different portfolios in one place is both a big data challenge and a systems challenge for banks.

Re-examining Capital Planning

New approaches to capital planning have elevated the role of treasury beyond solvency and into strategic planning, helping banks evaluate long-term capital under forward-looking projections as well as stressed scenarios. Regulation has focused attention back on capital efficiency and allocating increasingly scarce capital to businesses that outperform on a risk-adjusted basis.

Data management and quality remain key challenges to the delivery of these new approaches. Specifically for asset liability management (ALM) and liquidity risk, data is necessary to maintain a comprehensive balance sheet. For a large bank, the balance sheet can comprise trillions in assets, and the data requirement for ALM is huge because analysis must be performed on all assets and liabilities.

The data challenge is obvious when integrating information from the risk side (for regulatory and economic capital) with the finance side (for financial statements, general ledger accounts, liquidity, Generally Accepted Accounting Principles and International Financial Reporting Standards (GAAP/IFRS) compliance).

Historically, these departments operated independently and had no business incentive to work together. But regulatory imperatives - such as balance sheet stress testing, IAS 39, IFRS 9 and Dodd-Frank - are now motivating risk and finance groups to work together for compliance. An integrated view of risk and finance data conveys far more information than a disjointed view. Furthermore, banks can use that information for effective future strategic planning - the main goal when optimising capital planning and management.

An ERM architecture must have the ability to process each data item required. The best practice is to establish automatic metadata management that tracks the lineage of all models, stress scenarios, and data outputs.

The goal for ERM is to calculate analyses only once and use the results in various applications. The planning systems evolve from pure recipients of data into new suppliers of data. Within a ‘data universe’, these systems must provide information of overriding interest (e.g., cash flow plans) as a central service. When implemented consistently, the result is an integrated bank control architecture with a single point of truth.

Stepping Up Stress Testing

Another powerful ERM tool is the increasing use of stress testing to help business executives gain enterprise and business unit level oversight.

Financial institutions (FIs) are implementing enterprise stress testing systems that can be used for various analyses and reporting. Stress testing a bank's portfolio spans several dimensions: macroeconomic factors must be linked to the portfolio's loan performance (credit risk), liquidity profile and market risk. This is a difficult modelling exercise that banks were not previously required to do.

Since the development of Basel II, stress testing has been hotly debated and, in many instances, inadequately implemented. The topic is relevant both from a regulatory perspective and for internal risk assessment in the context of the internal capital adequacy assessment process (ICAAP). The Capital Requirements Directive (CRD IV) consultations, the European implementation of Basel III, have already addressed the existing shortcomings in both stress testing and back testing.

Stress tests represent only one form of simulation, one that highlights the effects of extremely negative developments. Simulation capability, however, is important for applications that go far beyond this, especially for identifying opportunities in areas such as economic development planning and control. A simulation can, and should, make it possible to evaluate the effects of decisions.

In stress tests, the bank's financial position is examined under different environmental conditions (scenarios). Negative economic developments are simulated by means of risk parameters to show their potential effect on the portfolio. The goal is early detection of serious changes in the risk structure in order to estimate bank stability, especially in periods of crisis. For planning timely countermeasures, both the executive board and the supervisory board need to know and understand the risks.

To use stress tests continuously and regularly as a management tool, a process must transparently document the assumptions, models, and the results. The risk process must also be established across departments so that the entire bank’s risk can be determined, which influences the calculation of the risk-bearing capacity.

Benefits of ERM

ERM implementations attempt to develop an integrated method of approaching risk: integrating the different approaches to risks and their solutions, as well as the approaches used to identify and manage risk effectively. ERM can provide firms with the means to decide how much uncertainty (risk) to accept in pursuit of growth, with new stress testing and capital planning capabilities playing an increasingly important role in senior management decision making.

The key to meeting ERM objectives is to develop a risk infrastructure that obtains, and then maintains, a holistic view of risk while dealing with ever-evolving data, complexity in measurement and timely management reporting. The reality for many firms, however, is that following such an approach depends heavily on how the IT infrastructure has grown. As noted above, an integrated view of risk and finance data gives banks far more to work with, and they can use that information for effective future strategic planning - the main goal when optimising capital planning and management.

How has your approach to capital planning changed in light of today’s regulatory and financial environment?

Learn more about leading practices in capital planning and risk-adjusted performance management in the SAS whitepaper Capital Management for Banking: Dawn of a New Era and by visiting us at SAS Capital Planning & Management.

- by Larry Roadcap and Dave Rogers, SAS

tags: aggregation, capital management, cro, erm, icaap, risk, risk analytics, Risk Management, stress testing, treasury