SpeedTree uses a grid/cell-based culling system that is efficient at culling very large forests (millions of instances). There are three broad steps involved in forest-level culling:
The SDK is designed so that the client application streams in the tree instances that are visible; the SDK automatically flushes instances that are no longer visible.
The SDK organizes the world as a series of cells. As the camera moves, cells go in and out of visibility. As cells become visible, the SDK will provide a list of those cells that need to have their populations streamed in. Hence, the SDK will perform most efficiently when the client has the tree instances already organized by cells so that the data might be passed into the SDK without further on-the-fly processing.
In the reference application, this is done using our example CMyInstancesContainer, defined in MyPopulate.h/cpp. In videogames, this class would likely be used to store all of the tree instances for a given level. CMyInstancesContainer is initialized by passing all the base trees and instances in once, and then calling SplitIntoCells(), which organizes the input data in a way that will efficiently feed into the SDK. Note that you are free to use any type of data structure here that best suits your needs. CMyInstancesContainer is merely an example that works efficiently for the needs of the SDK's reference application (although not in as heap-friendly a manner as it could).
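As a rough illustration of this kind of up-front organization, the sketch below buckets a flat list of instances by cell once, at load time. It is not the CMyInstancesContainer implementation: TreeInstance, CellKey, CellKeyFor, and SplitInstancesIntoCells are hypothetical names, and the sketch assumes a z-up world with square cells aligned to the world origin.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical app-side instance record; not an SDK type.
struct TreeInstance {
    int   baseTreeId; // which base tree this instance uses
    float x, y, z;    // world position (z-up is an assumption)
    float scale;      // uniform instance scale
};

// A cell is identified by its (row, col) position in the grid.
using CellKey = std::pair<std::int32_t, std::int32_t>;

// Map a world position to the cell that contains it. std::floor keeps
// negative coordinates in the correct cell (plain truncation would not).
inline CellKey CellKeyFor(float x, float y, float cellSize) {
    assert(cellSize > 0.0f);
    return { static_cast<std::int32_t>(std::floor(y / cellSize)),
             static_cast<std::int32_t>(std::floor(x / cellSize)) };
}

// One-time preprocessing: group every instance under the cell it falls in,
// so a cell's population can later be handed over without per-frame sorting.
inline std::map<CellKey, std::vector<TreeInstance>>
SplitInstancesIntoCells(const std::vector<TreeInstance>& instances, float cellSize) {
    std::map<CellKey, std::vector<TreeInstance>> cells;
    for (const TreeInstance& inst : instances)
        cells[CellKeyFor(inst.x, inst.y, cellSize)].push_back(inst);
    return cells;
}
```

A std::map of vectors is used here only for brevity. It allocates freely, so an application that cares about heap behavior would likely prefer a flat, pre-sized layout.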
There are several classes and structures you'll need to get started:
These objects are detailed here. You'll need one CVisibleInstances class per view and it should persist as long as the world does. The quantities of the other classes will be determined by how your world is populated.
As shown in CMyApplication::StreamTreePopulation(), the general procedure followed for each frame in which the camera has moved is outlined below. The whole procedure happens very quickly, even for densely populated forests.
Per frame, the general procedure is as follows (please see CMyApplication::StreamTreePopulation() for details on our implementation).
To update the instance buffers for the billboard geometry, invoke CVisibleInstancesRI::CopyBillboardInstancesToGpu() for each of the base trees in the scene. Likewise, invoke CVisibleInstancesRI::UpdateGrassInstanceBuffers() for each base grass in the scene. Sometimes these instance VB updates will wait on the GPU because the buffer is busy. The SDK double buffers the instance buffers to help avoid this.
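The idea behind double buffering can be sketched on the CPU side as follows. This is only an analogy for the technique, not the SDK's GPU buffer code; DoubleBuffered and its members are hypothetical names. One buffer is written while the other, filled during the previous frame, is still being read, so the writer does not have to wait for the reader to finish.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical ping-pong container: writes go to one buffer while the
// other (filled before the last swap) remains available for reading.
template <typename T>
class DoubleBuffered {
public:
    // Buffer that may be filled for the upcoming frame.
    std::vector<T>& WriteBuffer() { return m_buffers[m_writeIndex]; }

    // Buffer that was filled before the most recent Swap().
    const std::vector<T>& ReadBuffer() const { return m_buffers[1 - m_writeIndex]; }

    // Exchange the roles of the two buffers, typically once per frame.
    void Swap() { m_writeIndex = 1 - m_writeIndex; }

private:
    std::array<std::vector<T>, 2> m_buffers;
    int m_writeIndex = 0;
};
```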
The reference application shows exactly how run-time culling and LOD computation works for 3D trees, billboards, and grass in the CMyPopulate class as called from CMyApplication::Cull().
Note: This primarily outlines how to cull tree models, but grass is handled in a very similar way, detailed here.
The SpeedTree SDK is often integrated into world builders, where the forest population can change even when the camera has not moved. Camera movement is what normally triggers the SDK's streaming/culling code, so in this case the population event has to be triggered manually: call CVisibleInstances::NotifyOfPopulationChange() and then call your streaming/population function again (the 8 steps outlined above). NotifyOfPopulationChange() clears any history of cell visibility, so the new cell list will contain all of the visible cells and they can be populated again.
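The effect of clearing the visibility history can be illustrated with a small stand-in. The class below is not CVisibleInstances and does not reflect how the SDK is implemented; CellVisibilityHistory, Update, and ForgetHistory are hypothetical names used only to show why discarding the history makes every visible cell appear as new again.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using CellKey = std::pair<int, int>; // (row, col), hypothetical

// Stand-in that remembers which cells were visible on the previous update.
class CellVisibilityHistory {
public:
    // Returns only the cells that are visible now but were not visible
    // on the previous call, i.e. the cells that need to be populated.
    std::vector<CellKey> Update(const std::set<CellKey>& visibleNow) {
        std::vector<CellKey> newlyVisible;
        for (const CellKey& cell : visibleNow)
            if (m_previous.count(cell) == 0)
                newlyVisible.push_back(cell);
        m_previous = visibleNow;
        return newlyVisible;
    }

    // Analogous in spirit to a population-change notification: with no
    // history, the next Update() reports every visible cell as new.
    void ForgetHistory() { m_previous.clear(); }

private:
    std::set<CellKey> m_previous;
};
```

With an unchanged camera, a second Update() returns nothing to populate. After ForgetHistory(), the same call returns the full set of visible cells.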
While the list above outlines the general approach needed for efficient and accurate tree streaming and culling, there are several other important points.
The entire streaming and culling system is based on organizing trees into evenly-spaced cells. It's important for performance reasons that the application can quickly populate the cells in step 6 above. This mostly means not wasting cycles during a render loop determining which instances are going into which cells. We provide the example class CMyInstancesContainer, defined in the reference application in MyPopulate.h/cpp. It is not sophisticated in its dynamics or heap usage, but it does show how to quickly and easily organize an existing population of base trees and instances into cells so that they can be quickly passed into the SDK.
Step 3 above explains that a cell's height extents must be provided. Specifically, these extents are from the bottom of the lowest tree in the cell to the top of the tallest. Again, it makes sense to have this data available ahead of time and not waste cycles determining it in the render loop.
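A minimal sketch of computing those extents ahead of time is shown below. InstanceHeight, HeightExtents, and ComputeCellHeightExtents are hypothetical names, not SDK types, and the sketch assumes a z-up world in which each tree's lowest point is its planted position.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical per-instance data needed for the extents.
struct InstanceHeight {
    float baseZ;  // height at which the instance is planted
    float height; // height of the tree after instance scaling
};

struct HeightExtents {
    float minZ; // bottom of the lowest tree in the cell
    float maxZ; // highest treetop in the cell
};

// Compute the extents once, when the cell's population is built, so the
// render loop only has to look the values up.
inline HeightExtents ComputeCellHeightExtents(const std::vector<InstanceHeight>& instances) {
    assert(!instances.empty());
    HeightExtents extents { std::numeric_limits<float>::max(),
                            std::numeric_limits<float>::lowest() };
    for (const InstanceHeight& inst : instances) {
        extents.minZ = std::min(extents.minZ, inst.baseZ);
        extents.maxZ = std::max(extents.maxZ, inst.baseZ + inst.height);
    }
    return extents;
}
```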
The cell size for 3D trees (set in the SDK by using CVisibleInstances::SetCellSize()) is an important parameter for performance. The SDK first determines the number of visible cells. The larger the cells, the more quickly this determination is made. Once the cells are determined though, for any that intersect the frustum and aren't at a billboard distance, the SDK must loop through the 3D instances to determine their individual visibility. Having smaller cells reduces this time. The reference application, whose world units are feet, uses a default cell size of 1200.0 feet (set in the SFC file, world::3dtree_cell_size parameter), which we believe strikes a good balance for our example forest.
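To get a feel for the first half of this trade-off, the sketch below counts how many grid cells overlap a square region around the camera. It is a simplification under stated assumptions: the SDK tests cells against the view frustum, not a square, and CellsOverlappingSquare and the visibility radius are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Number of grid cells overlapped by the axis-aligned square
// [cx - r, cx + r] x [cy - r, cy + r], for cells aligned to the origin.
inline long CellsOverlappingSquare(float cx, float cy, float r, float cellSize) {
    assert(cellSize > 0.0f && r >= 0.0f);
    const long minCol = static_cast<long>(std::floor((cx - r) / cellSize));
    const long maxCol = static_cast<long>(std::floor((cx + r) / cellSize));
    const long minRow = static_cast<long>(std::floor((cy - r) / cellSize));
    const long maxRow = static_cast<long>(std::floor((cy + r) / cellSize));
    return (maxCol - minCol + 1) * (maxRow - minRow + 1);
}
```

For a camera at (600, 600) with a hypothetical radius of 3000 feet, a cell size of 1200 gives 36 cells to consider, while a cell size of 300 gives 441. Fewer, larger cells make the cell-level pass cheaper, at the cost of more instances to test individually inside each cell that intersects the frustum.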
To lift the hood a bit on how the SDK handles converting the user's arbitrary instance arrays into lists of 3D trees and billboards (two completely different representations):
The organization of multiple types of tree instances into an arbitrary collection of cells can easily lead to a great number of heap allocations if you're not careful. We leave it to the user to implement their own app-side storage data structure, though our reference application's example CMyInstancesContainer is not an ideal steward of the heap.
To control the SDK's heap behavior, be sure to read about the Reserves System.