Outline

SpeedTree uses a grid/cell-based culling system that is efficient at culling very large forests (millions of instances). There are three broad steps involved in forest-level culling:

  1. Setting the camera parameters.
  2. Calling the cull functions which will provide the cull results.
  3. Reporting the tree instances that appear in the provided cells to the SDK.

Overview

The SDK is set up so that the client application streams in the tree instances that are visible; the SDK automatically flushes instances that are no longer visible.

The SDK organizes the world as a series of cells. As the camera moves, cells go in and out of visibility. As cells become visible, the SDK will provide a list of those cells that need to have their populations streamed in. Hence, the SDK will perform most efficiently when the client has the tree instances already organized by cells so that the data might be passed into the SDK without further on-the-fly processing.

In the reference application, this is done using our example CMyInstancesContainer, defined in MyPopulate.h/cpp. In videogames, this class would likely be used to store all of the tree instances for a given level. CMyInstancesContainer is initialized by passing all the base trees and instances in once, and then calling SplitIntoCells(), which organizes the input data in a way that will efficiently feed into the SDK. Note that you are free to use any type of data structure here that best suits your needs. CMyInstancesContainer is merely an example that works efficiently for the needs of the SDK's reference application (although not in as heap-friendly a manner as it could).
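To make the idea of pre-organizing instances by cell concrete, here is a minimal sketch of the kind of one-time bucketing that a function like SplitIntoCells() might perform. It is not the reference application's code: ExampleInstance, CellKey, CellKeyFor, and SplitIntoCellsSketch are hypothetical names, and the sketch assumes an evenly spaced grid anchored at the world origin with x/y as the horizontal axes.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an app-side instance; not an SDK type.
struct ExampleInstance
{
    float x, y, z;   // world position
    int   baseTree;  // index of the base tree this instance uses
};

using CellKey = std::pair<int, int>; // (row, column) in the grid

// Map a world-space position to the grid cell that contains it.
// std::floor keeps negative coordinates in the correct cell.
inline CellKey CellKeyFor(float x, float y, float cellSize)
{
    assert(cellSize > 0.0f);
    return CellKey(static_cast<int>(std::floor(y / cellSize)),
                   static_cast<int>(std::floor(x / cellSize)));
}

// One-time preprocessing: bucket every instance by cell so that per-frame
// code only has to look a cell up, never search for its instances.
inline std::map<CellKey, std::vector<ExampleInstance>>
SplitIntoCellsSketch(const std::vector<ExampleInstance>& instances, float cellSize)
{
    std::map<CellKey, std::vector<ExampleInstance>> cells;
    for (const ExampleInstance& inst : instances)
        cells[CellKeyFor(inst.x, inst.y, cellSize)].push_back(inst);
    return cells;
}
```

A std::map is used here only for brevity. As noted above, any container that suits your application will do, and a flat array indexed by row and column would likely be friendlier to the heap.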


Getting Started

There are several classes and structures you'll need to get started:

  • Class CView: CView encapsulates many of the common values associated with a particular view like projection and modelview matrices and near and far clipping planes, but it also includes code for deriving values like camera azimuth and pitch, and frustum data.
  • Class CTree: These are known as “base trees.” There is one of these objects for every SRT file loaded in a scene. It contains a complete definition of a single tree model.
  • Structure SInstance: Each base tree may have one or more instances, each with unique position, orientation and scale. STreeInstance derives from SInstance, adding culling-specific data.
  • Class CCell: The forest is divided into a series of evenly-spaced cells. Cells contain SInstances.
  • Class CVisibleInstances: The central class for instance culling and streaming. It contains the functions for determining which cells are visible, instance culling, and 3D instance LOD computations.
  • Structure S3dTreeInstanceLod: When the SDK provides a list of 3D instances (that is, instances rendered as 3D geometry rather than billboards), it does so using this structure, which contains an SInstance pointer as well as important LOD data.

These objects are detailed here. You'll need one CVisibleInstances object per view and it should persist as long as the world does. The quantities of the other objects will be determined by how your world is populated.


Streaming and Culling

As shown in CMyApplication::StreamTreePopulation(), the general procedure followed for each frame where the camera has moved is outlined below (please see that function for details on our implementation). The whole procedure happens very quickly, even for densely populated forests.

  1. Set View: Make sure the CView object has been updated with the current view. Note that normally one CView object is used per view (e.g. one for the main camera, another for shadows) and they commonly persist with the world. Sometimes there are challenges aligning SpeedTree's expected view matrices with the client application's.

  2. Rough Cull: Call CVisibleInstances::RoughCullCells() with the current CView. Because the SDK does not store the entire tree instance population for a level (it stores only the visible cells), it cannot know beforehand the complete extents of the cells. Specifically, it knows the horizontal (x/y) dimensions of the cells because they're in a grid layout, but it cannot know their vertical extents on a rolling terrain without input from the application. RoughCullCells() will return a long list of cells that would be visible if the grid were composed of infinitely tall cells.

  3. Set Rough Cell Extents: Your application provides the rough cell extents. The CMyInstancesContainer class holds complete extents for each cell (including height), and the reference application looks up these extents for the rough cull list and sets them for each cell. Use CCell::GetExtents() on each of the rough culled cells to get its x/y extents.

  4. Fine Cull: Call CVisibleInstances::FineCullCells() with the current CView. This gives the SDK a chance to use the updated cell extents to produce an exact list of those cells that are within the view frustum.

  5. Get Newly-Visible Cells: Once CVisibleInstances::FineCullCells() has been called, a list of newly visible cells can be retrieved from the class by calling CVisibleInstances::NewlyVisibleCells(). Newly visible cells are those cells that were not in the frustum the last time FineCullCells() was called.

  6. Populate Cells With Instances: Populate newly visible cells by invoking CCell::SetTreeInstances() on each one. This function takes both a list of base trees and a list of instances. Note two points when using it:

    • The instances are passed as pointers to the app-side instances and are passed as a single array where the instances for the various base trees are concatenated together.

    • The instances must be concatenated in the same order as the base tree pointers are passed to CCell::SetTreeInstances().

  7. Compute LOD: Once the class has a list of all the instances within the frustum, it needs to separate those that will render in full 3D from those that will render as billboards. Do this by calling CVisibleInstances::Cull3dTreesAndComputeLod(). When it executes, it will also determine the LOD states for the 3D trees in an array of S3dTreeInstanceLod objects.

  8. * Optional * – Update Instance Lists: The SDK Render Interface library derives the CVisibleInstancesRI class from CVisibleInstances, allowing it to add rendering components to it, specifically instancing-based rendering code for 3D trees, grass, and billboards. As such, the client app can control when this derived class updates its instance list vertex buffers by calling Cull3dTreesAndUpdateInstanceBuffers(). Note that if you call this function, it will automatically call CVisibleInstances::Cull3dTreesAndComputeLod(), so you can skip step 7 above.

To update the instance buffers for the billboard geometry, invoke CVisibleInstancesRI::CopyBillboardInstancesToGpu() for each of the base trees in the scene. Likewise, invoke CVisibleInstancesRI::UpdateGrassInstanceBuffers() for each base grass in the scene. Sometimes these instance vertex buffer updates will wait on the GPU because the buffer is busy. The SDK double buffers the instance buffers to help avoid this.

The reference application shows exactly how run-time culling and LOD computation works for 3D trees, billboards, and grass in the CMyPopulate class as called from CMyApplication::Cull().
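The relationship between the rough cull (step 2) and the fine cull (step 4) can be pictured with a small, self-contained sketch. This is not the SDK's implementation: Plane, Box, BoxTouchesVolume, RoughCullSketch, and FineCullSketch are hypothetical names, and the view volume is assumed to be a set of inward-facing planes with z as the vertical axis. The rough pass treats each cell as a very tall column, so it can only over-report; the fine pass repeats the same test with the vertical extents supplied in step 3.

```cpp
#include <cassert>
#include <vector>

// All types here are hypothetical stand-ins, not SDK classes.
struct Plane { float nx, ny, nz, d; };  // a point p is inside when n.p + d >= 0
struct Box   { float minX, minY, minZ, maxX, maxY, maxZ; };

// Conservative box-vs-planes test: the box is rejected only if it lies
// entirely behind at least one plane.
inline bool BoxTouchesVolume(const Box& b, const std::vector<Plane>& planes)
{
    for (const Plane& p : planes)
    {
        // Pick the box corner that lies farthest along the plane normal.
        const float x = (p.nx >= 0.0f) ? b.maxX : b.minX;
        const float y = (p.ny >= 0.0f) ? b.maxY : b.minY;
        const float z = (p.nz >= 0.0f) ? b.maxZ : b.minZ;
        if (p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f)
            return false;
    }
    return true;
}

// Rough cull: vertical extents are unknown, so treat the cell as a very
// tall column. A large finite value is used instead of infinity so that a
// zero normal component never produces NaN.
inline bool RoughCullSketch(Box cell, const std::vector<Plane>& planes)
{
    cell.minZ = -1.0e9f;
    cell.maxZ = 1.0e9f;
    return BoxTouchesVolume(cell, planes);
}

// Fine cull: the application has supplied the real vertical extents.
inline bool FineCullSketch(const Box& cell, const std::vector<Plane>& planes)
{
    return BoxTouchesVolume(cell, planes);
}
```

For example, a cell whose trees all sit on a hilltop above the top plane of the view volume passes the rough test but fails the fine test, which is why the rough list is longer than the final one.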

Note: This procedure primarily describes culling tree models, but grass is handled in a very similar way, detailed here.

World Builder Considerations

The SpeedTree SDK is often integrated into world builders, where the forest population can change dynamically even when the camera has not moved. Camera movement is what normally triggers the SDK's streaming/culling code, so in this case you must trigger a new population event manually: call CVisibleInstances::NotifyOfPopulationChange() and then call your streaming/population function again (the 8 steps outlined above). NotifyOfPopulationChange() clears out any history of cell visibility, so the new cell list will contain all the visible cells and they can be populated again.
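The effect of clearing the visibility history can be illustrated with a small sketch of "newly visible" bookkeeping. This only mimics the behavior described above and is not how the SDK is implemented; CellId, VisibilityHistorySketch, Update, and ForgetHistory are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using CellId = std::pair<int, int>; // hypothetical (row, column) identifier

// Hypothetical helper that mimics the "newly visible" bookkeeping; it is
// not an SDK class.
class VisibilityHistorySketch
{
public:
    // Returns the cells in visibleNow that were not visible on the last call,
    // then remembers visibleNow as the new history.
    std::vector<CellId> Update(const std::vector<CellId>& visibleNow)
    {
        std::vector<CellId> newlyVisible;
        for (const CellId& id : visibleNow)
            if (m_previous.count(id) == 0)
                newlyVisible.push_back(id);
        m_previous = std::set<CellId>(visibleNow.begin(), visibleNow.end());
        return newlyVisible;
    }

    // Analogous in spirit to NotifyOfPopulationChange(): forget the history
    // so that every visible cell is reported as new on the next update.
    void ForgetHistory() { m_previous.clear(); }

private:
    std::set<CellId> m_previous;
};
```

With a stationary camera, a second Update() over the same cells returns nothing, so edited cells would never be repopulated. Calling ForgetHistory() first makes the next Update() return every visible cell.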


Other Considerations

While the list above outlines the general approach needed for efficient and accurate tree streaming and culling, there are several other important points.

App-side Instance Data

The entire streaming and culling system is based on organizing trees into evenly-spaced cells. It's important for performance reasons that the application can quickly populate the cells in step 6 above. This mostly means not wasting cycles during a render loop determining which instances are going into which cells. We provide the example class CMyInstancesContainer, defined in the reference application in MyPopulate.h/cpp. It is not sophisticated in its dynamics or heap usage, but it does show how to quickly and easily organize an existing population of base trees and instances into cells so that they can be quickly passed into the SDK.
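One way to keep step 6 cheap is to store each cell's instance pointers already concatenated in base-tree order, so that populating a cell is a simple hand-off of two arrays. The sketch below shows only that ordering rule. AppInstance, CellPayloadSketch, and BuildCellPayload are hypothetical names, base trees are represented by integer ids rather than CTree pointers, and omitting base trees that have no instances in a cell is an assumption rather than a documented requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for app-side data; not SDK types.
struct AppInstance { int baseTreeId; float x, y, z; };

struct CellPayloadSketch
{
    std::vector<int>                baseTreeIds; // base trees present, in order
    std::vector<const AppInstance*> instances;   // grouped in that same order
};

// Build one cell's payload: all instances of the first base tree, then all
// instances of the second, and so on. The returned pointers refer to
// elements of cellInstances, which must outlive the payload unchanged.
inline CellPayloadSketch BuildCellPayload(const std::vector<AppInstance>& cellInstances,
                                          const std::vector<int>& baseTreeOrder)
{
    CellPayloadSketch payload;
    for (int treeId : baseTreeOrder)
    {
        const std::size_t before = payload.instances.size();
        for (const AppInstance& inst : cellInstances)
            if (inst.baseTreeId == treeId)
                payload.instances.push_back(&inst);
        if (payload.instances.size() > before) // assumption: skip absent base trees
            payload.baseTreeIds.push_back(treeId);
    }
    return payload;
}
```

Building these payloads at load time, rather than inside the render loop, is the point of the exercise; the nested loop here favors clarity over speed.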

Cell Extents

Step 3 above explains that a cell's height extents must be provided. Specifically, these extents are from the bottom of the lowest tree in the cell to the top of the tallest. Again, it makes sense to have this data available ahead of time and not waste cycles determining it in the render loop.
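As a rough illustration of precomputing these height extents, the sketch below scans a cell's trees once and records the lowest base and the highest top. PlacedTree, VerticalExtents, and ComputeCellVerticalExtents are hypothetical names. The sketch assumes z is the vertical axis, that each tree's origin sits at its base with no geometry below it, and that instance scale is uniform.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical app-side records; not SDK types.
struct PlacedTree
{
    float baseZ;        // height of the ground where the tree is planted
    float scale;        // uniform instance scale
    float modelHeight;  // unscaled height of the tree's base model
};

struct VerticalExtents { float minZ, maxZ; };

// Precompute, once per cell, the span from the bottom of the lowest tree to
// the top of the tallest one. The cell must contain at least one tree.
inline VerticalExtents ComputeCellVerticalExtents(const std::vector<PlacedTree>& trees)
{
    assert(!trees.empty());
    VerticalExtents e = { trees[0].baseZ,
                          trees[0].baseZ + trees[0].scale * trees[0].modelHeight };
    for (const PlacedTree& t : trees)
    {
        e.minZ = std::min(e.minZ, t.baseZ);
        e.maxZ = std::max(e.maxZ, t.baseZ + t.scale * t.modelHeight);
    }
    return e;
}
```

Storing the result alongside each cell means step 3 becomes a lookup rather than a scan.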

Cell Size

The cell size for 3D trees (set in the SDK by using CVisibleInstances::SetCellSize()) is an important parameter for performance. The SDK first determines the number of visible cells. The larger the cells, the more quickly this determination is made. Once the cells are determined though, for any that intersect the frustum and aren't at a billboard distance, the SDK must loop through the 3D instances to determine their individual visibility. Having smaller cells reduces this time. The reference application, whose world units are feet, uses a default cell size of 1200.0 feet (set in the SFC file, world::3dtree_cell_size parameter), which we believe strikes a good balance for our example forest.
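The trade-off can be made tangible with some back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: MaxCellsInViewSquare and AverageInstancesPerCell are hypothetical helpers, the visible region is approximated by a square of side twice the view distance, and instances are assumed to be uniformly distributed.

```cpp
#include <cassert>
#include <cmath>

// Upper bound on how many grid cells can overlap a square of side
// 2 * viewDistance centered on the camera. The +1 per axis covers a grid
// that is not aligned with the square's edges.
inline int MaxCellsInViewSquare(float viewDistance, float cellSize)
{
    assert(viewDistance > 0.0f && cellSize > 0.0f);
    const int perAxis = static_cast<int>(std::ceil(2.0f * viewDistance / cellSize)) + 1;
    return perAxis * perAxis;
}

// Average number of instances a per-instance loop visits in one cell,
// assuming a uniform density in instances per square world unit.
inline float AverageInstancesPerCell(float density, float cellSize)
{
    return density * cellSize * cellSize;
}
```

With a hypothetical view distance of 6000 feet, a 1200-foot cell gives at most 121 cells to test, while a 300-foot cell gives at most 1681. At a hypothetical density of 0.001 trees per square foot, those cells hold roughly 1440 and 90 instances respectively, which is the cost paid per cell that needs per-instance culling.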

3D Trees vs. Billboards

To lift the hood a bit on how the SDK handles converting the user's arbitrary instance arrays into lists of 3D trees and billboards (two completely different representations):

  • 3D Trees: The cells are separated into those that fall into and out of the 3D LOD range for a given base tree. For those that fall within, each tree may be frustum culled (depending on whether the cell intersects the frustum or is completely enclosed), and then has an S3dTreeInstanceLod object computed and added to a list.
  • Billboards: The billboard population is built directly from the visible cells list. The entire population of every cell is copied into an instance list and rendered. So, trees that appear in full 3D directly in front of the camera (that would never be rendered with a billboard), still send a billboard down the render pipeline. The billboard vertex shader is able to quickly determine that a billboard close to the camera should be invisible and scales it to zero. We have found that this is much more efficient than using the CPU to determine the exact list of visible billboards in any given view.
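The first step for 3D trees, separating cells by LOD range, might look something like the sketch below. How the SDK actually performs this test is not documented here, so treat the sketch as one plausible approach: CellFootprint, DistanceToCell, and CellMayContain3dTrees are hypothetical names, and distance is measured in the horizontal plane only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct CellFootprint { float minX, minY, maxX, maxY; }; // hypothetical, not an SDK type

// Horizontal distance from the camera to the nearest point of the cell;
// zero when the camera is above or inside the cell's footprint.
inline float DistanceToCell(const CellFootprint& c, float camX, float camY)
{
    const float dx = std::max({ c.minX - camX, 0.0f, camX - c.maxX });
    const float dy = std::max({ c.minY - camY, 0.0f, camY - c.maxY });
    return std::sqrt(dx * dx + dy * dy);
}

// A cell needs per-instance 3D processing only if some part of it can be
// closer to the camera than the base tree's 3D LOD range.
inline bool CellMayContain3dTrees(const CellFootprint& c, float camX, float camY,
                                  float lodRange)
{
    return DistanceToCell(c, camX, camY) < lodRange;
}
```

Cells that fail this test can skip the per-instance loop entirely for that base tree, since every instance in them would be drawn as a billboard anyway.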

Heap Fragmentation Control

The organization of multiple types of tree instances into an arbitrary collection of cells can easily lead to a great number of heap allocations if you're not careful. We leave it to the user to implement their own app-side storage data structure, though our reference application's example CMyInstancesContainer is not an ideal steward of the heap.

To control the SDK's heap behavior, be sure to read about the Reserves System.