英文原文地址 "Memory optimization for feeds on Android" 讀後感 在Java中HashSet只能存放繼承自Objcet的對象,這中情況下“基本數據類型”轉化為繼承自Object的( 、`Long`等)會產生很多中間Object對象,占用過多的記憶體,從而引發垃 ...
Original article (in English): Memory optimization for feeds on Android

My takeaway

In Java, a HashSet can only hold objects (types that extend Object). In this situation, converting "primitive types" into their Object counterparts (Integer, Long, and so on) produces a large number of intermediate objects, which take up too much memory, trigger garbage collection, and make the app stutter.

The original article follows.
Millions of people use Facebook on Android devices every day, scrolling through News Feed, Profile, Events, Pages, and Groups to interact with the people and information they care about. All of these different feed types are powered by a platform created by the Android Feed Platform team, so any optimization we make to the Feed platform could potentially improve performance across our app. We focus on scroll performance, as we want people to have a smooth experience when scrolling through their feeds.
To help us achieve that, we have several automatic tools that run performance tests on the Feed platform, across different scenarios and on different devices, measuring how our code performs in runtime, memory use, frame rate, and more. One of those tools, Traceview, showed a relatively high number of calls to the Long.valueOf() function, which led to objects accumulating in memory and causing the app to stall. This post describes the problem, the potential solutions we weighed, and the set of optimizations we made to improve the Feed platform.
The downside of convenience
After noticing the high number of calls to the Long.valueOf() function in one of our method profiling reports from Traceview, we ran further tests that confirmed that as we scrolled through News Feed, there were an unexpectedly high number of invocations for this method.
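For readers who want to reproduce this kind of measurement, one way to capture a trace that Traceview can open is to wrap the scroll scenario in Android's method-tracing calls. This is only a sketch under that assumption; the trace name and the runScrollScenario callback are placeholders, not Facebook's actual test harness.

```java
import android.os.Debug;

// Sketch only: wrap a scroll scenario in method tracing so the resulting
// .trace file can be inspected in Traceview. The trace name and the
// runScrollScenario callback are placeholders.
public class FeedScrollTracer {
    public static void traceScroll(Runnable runScrollScenario) {
        Debug.startMethodTracing("feed_scroll"); // writes feed_scroll.trace
        try {
            runScrollScenario.run();             // e.g. programmatically fling the feed list
        } finally {
            Debug.stopMethodTracing();
        }
    }
}
```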
When we looked at the stacktrace, we found that the function was not being called explicitly from Facebook's code but implicitly by code inserted by the compiler. This function was called when assigning a primitive long value where a Long object is expected. Java supports both Object and primitive representations of the simple types (e.g., integer, long, character) and provides a way to seamlessly convert between them. This feature is called autoboxing, because it "boxes" a primitive type into a corresponding Object type. While that's a convenient development feature, it creates new objects unbeknownst to the developer.
And these objects add up.
In this heap dump taken from a sample app, Long objects have a noticeable presence; while each object is not big by itself, there are so many that they occupy a large portion of the app's memory in the heap. This is especially problematic for devices running the Dalvik runtime. Unlike ART, which is the newer Android runtime environment, Dalvik doesn't have a generational garbage collection, known to be more optimal for handling many small objects. As we scroll through News Feed and the number of objects grows, the garbage collection will cause the app to pause and sweep unused objects from the memory. The more objects that accumulate, the more frequently the garbage collector will have to pause the app, causing it to stutter and stall and making for a poor user experience.
Luckily, tools such as Traceview and Allocation Tracker can help find where these calls are made. After reviewing the origins of these autoboxing occurrences, we found that the majority of them were done while inserting long values into a HashSet&lt;Long&gt;. (We use that data structure to store the hashes of News Feed stories and to check, by hash, whether a given story is already in the set.) HashSet gives us fast lookups while the user scrolls through News Feed. But HashSet can only store objects, not primitive types, while the value we compute and store is a primitive long, so every setStories.put(lStoryHash) call inevitably converts the long into a Long.
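A sketch of that membership check might look like the following (using Set's contains and add; the names setStories and lStoryHash follow the text above, and the hash value is made up):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the story-set membership check described above.
Set<Long> setStories = new HashSet<>();

long lStoryHash = 0x5DEECE66DL;           // the story's hash, a primitive long (value made up)

if (!setStories.contains(lStoryHash)) {   // lStoryHash is boxed to a Long for the lookup
    setStories.add(lStoryHash);           // and boxed again when stored as a key
}
```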
As a solution, a Set implementation for primitive types can be used, but that turned out not to be as straightforward as we expected.
Existing solutions
There are a few existing Java libraries that provide a Set implementation for primitive types. Almost all of these libraries were created more than 10 years ago, when the only Java running on mobile devices was J2ME. So, to determine viability, we needed to test them under Dalvik/ART and ensure they could perform well on more constrained mobile devices. We created a small testing framework to help compare these libraries with the existing HashSet and with one another. The results showed that a few of these libraries had a faster runtime than HashSet, and with fewer Long objects, but they still internally allocated a lot of objects. As an example, TLongHashSet, part of the Trove library, allocated about 2 MB worth of objects when tested with 1,000 items.
Testing other libraries, such as PCJ and Colt, showed similar results.
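For context, code that uses one of these primitive-set libraries looks roughly like this. This is just a sketch; TLongHashSet is Trove's class, and the package path shown assumes Trove 3.x:

```java
import gnu.trove.set.hash.TLongHashSet;   // Trove 3.x package path (assumption)

// Sketch: Trove stores the primitive longs directly, so no Long boxing occurs,
// but the library still allocates internal bookkeeping objects.
TLongHashSet ids = new TLongHashSet();
ids.add(42L);
boolean present = ids.contains(42L);      // true
```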
It was possible that the existing solutions didn't fit our needs. We considered whether we could instead create a new Set implementation and optimize it for Android. Looking inside Java's HashSet, there's a relatively simple implementation using a single HashMap to do the heavy lifting.
```java
public class HashSet<E> extends AbstractSet<E> implements Set<E>, ... {

    transient HashMap<E, HashSet<E>> backingMap;

    ...

    @Override public boolean add(E object) {
        return backingMap.put(object, this) == null;
    }

    @Override public boolean contains(Object object) {
        return backingMap.containsKey(object);
    }

    ...
}
```
Adding a new item to the HashSet means adding it to the internal HashMap, where the object is the key and the HashSet's instance is the value. To check object membership, HashSet checks whether its internal HashMap contains the object as a key. An alternative to HashSet could be implemented using an Android-optimized map and the same principles.
Introducing LongArraySet
You may already be familiar with LongSparseArray, a class in the Android Support Library that acts as a map using a primitive long as a key. Example usage:
```java
LongSparseArray<String> longSparseArray = new LongSparseArray<>();
longSparseArray.put(3L, "Data");
String data = longSparseArray.get(3L); // the value of data is "Data"
```
LongSparseArray works differently than HashMap, though. When calling mapHashmap.get(KEY5), this is how the value is found in the HashMap:
When retrieving a value using a key on a HashMap, it's accessing the value in the array using a hash of the key as the index, a direct access in O(1). Making the same call on a LongSparseArray looks like this:
LongSparseArray searches a sorted keys array for the key's value using a binary search, an operation with a runtime of O(log N). The index of the key in the array is then used to find the value in the values array.
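The position found by that binary search is visible through indexOfKey(); here is a small illustration (the keys and values are arbitrary):

```java
// Keys are kept sorted internally as [5, 9, 20]; indexOfKey() reports the
// position located by the binary search described above.
LongSparseArray<String> stories = new LongSparseArray<>();
stories.put(20L, "story-20");
stories.put(5L, "story-5");
stories.put(9L, "story-9");

int index = stories.indexOfKey(9L);     // 1
String value = stories.valueAt(index);  // "story-9"
```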
HashMap allocates one big array in an effort to avoid collisions, which would slow the search down. LongSparseArray allocates two small arrays, making its memory footprint smaller. But to support its search algorithm, LongSparseArray needs to allocate its internal arrays in consecutive memory blocks. Adding more items will require allocating new arrays when there's no more space in the current ones. The way LongSparseArray works makes it less ideal when holding more than 1,000 items, where these differences have a more significant impact on performance. (You can learn more about LongSparseArray in the official documentation and by watching this short video by Google.)
Since the LongSparseArray's keys are of a primitive long type, we can create a data structure with the same approach as HashSet, but with a LongSparseArray as the internal map instead of HashMap.
And so LongArraySet was created.
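The post doesn't list the class at this point, but a minimal sketch of the idea, assuming android.util.LongSparseArray (androidx.collection offers an equivalent) and mirroring HashSet's backing-map trick with a sentinel value, could look like this. It is not Facebook's actual implementation, which also handles iteration, clearing, and more:

```java
import android.util.LongSparseArray;

// Minimal sketch of the idea, not Facebook's actual LongArraySet: a set of
// primitive longs backed by a LongSparseArray instead of a HashMap.
public class LongArraySet {
    private static final Object PRESENT = new Object();   // sentinel value, plays the role of "this" in HashSet
    private final LongSparseArray<Object> backingMap = new LongSparseArray<>();

    public void add(long value) {
        backingMap.put(value, PRESENT);    // the primitive key is never boxed
    }

    public boolean contains(long value) {
        return backingMap.get(value) != null;
    }

    public void remove(long value) {
        backingMap.delete(value);
    }

    public int size() {
        return backingMap.size();
    }
}
```

With this shape, a contains(long) call never goes through Long.valueOf(), which is exactly the allocation described earlier.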
The new data structure looked promising, but the first rule of optimization is "always measure." By using the same testing framework from earlier, we compared the new data structure with HashSet. Each data structure was tested by adding X number of items, checking the existence of each item, and later removing all of them. We ran tests using different numbers of items (X=10, X=100, X=1,000 ...) and averaged the time it took to complete each operation per item.
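A rough sketch of what such a per-operation timing loop can look like is shown below; the item count, key generation, and use of System.nanoTime() are assumptions, and a serious comparison would also need warm-up runs and repeated iterations:

```java
import java.util.HashSet;
import java.util.Set;

// Rough sketch of the add / contains / remove timing described above.
long[] items = new long[1000];
for (int i = 0; i < items.length; i++) {
    items[i] = i * 31L + 7L;              // arbitrary distinct keys
}

Set<Long> set = new HashSet<>();

long start = System.nanoTime();
for (long item : items) {
    set.add(item);
}
long addNanosPerItem = (System.nanoTime() - start) / items.length;

start = System.nanoTime();
for (long item : items) {
    set.contains(item);
}
long containsNanosPerItem = (System.nanoTime() - start) / items.length;

start = System.nanoTime();
for (long item : items) {
    set.remove(item);
}
long removeNanosPerItem = (System.nanoTime() - start) / items.length;
```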
The runtime results (time shown is in nanoseconds):
We saw a runtime improvement for the contains and remove methods using the new data structure. Also, as the number of items increased in the array set, it took more time to add new items. That's consistent with what we already knew about LongSparseArray: it doesn't perform as well as HashMap when the number of items is more than 1,000. In our use cases, we're dealing with only hundreds of items, so this is a trade-off we're willing to make.
We also saw a big improvement related to memory. In reviewing the heap dumps and Allocation Tracker reports, we noticed a decrease in object allocations. Here's a side-by-side allocation report for the HashSet and LongArraySet implementations, when adding 1,000 items for 20 iterations:
In addition to avoiding all the Long object allocations, LongSparseArray was more memory-efficient internally, with about 30 percent fewer allocations in this scenario.
Conclusion
By understanding how other data structures work, we were able to create a more optimized data structure for our needs. The less the garbage collector has to work, the lower the likelihood of dropped frames. Using the new LongArraySet class, and a similar IntArraySet for the primitive int data type, we were able to cut down a significant number of allocations in our entire app.
This case study demonstrates the importance of measuring our options to ensure that we were introducing an improvement. We also accepted that while this solution is not perfect for all use cases, since this implementation is slower for very large data sets, it was an acceptable trade-off for us to optimize our code.
You can find the source code for the two data structures here. We are excited to continue working on challenges and optimizing our Feed platform and to share our solutions with the community.
If that link doesn't open for you, the source code is also available here:
http://blog.csdn.net/xiaxl/article/details/72730959

My English isn't great, so if you spot any mistakes in my rendering of the article, please point them out and I will fix them. Thanks, everyone.