Internals of kMeansTrain - 2024.1 English

Vitis Libraries

Release Date
2024-08-06
Version
2024.1 English

This document describes the structure and execution of kMeansTrain, implemented as the kMeansTrain function.

kMeansTrain fits new centers with native k-means, using the existing samples and the initial centers you provide. To accelerate training, DV elements of a sample are input at the same time and used both to compute the distance to KU centers and to update the centers. Static configuration is set through template parameters and dynamic configuration through arguments of the API; the dynamic values should not be greater than the static ones.
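
As a rough illustration of what one training iteration computes, the following is a minimal software sketch of a single k-means (Lloyd) step in plain C++. It is not the Vitis library code or its API: the function name `lloydIteration` and the template parameters `MaxDim` and `MaxK` are hypothetical, chosen only to show static limits set by template parameters and dynamic sizes passed as arguments. In this sketch the loops over `d` and `c` run sequentially; they are the loops that, per the description above, the hardware would process DV elements and KU centers at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical illustration, not the Vitis library API.
// MaxDim and MaxK stand in for the static (template) configuration;
// dim and k are the dynamic (run-time) arguments.
template <typename T, std::size_t MaxDim, std::size_t MaxK>
void lloydIteration(const std::vector<T>& samples, // row-major, n * dim values
                    std::size_t dim,
                    std::size_t k,
                    std::vector<T>& centers) {     // row-major, k * dim values
    // Dynamic settings must not be greater than the static ones.
    assert(dim > 0 && dim <= MaxDim && k > 0 && k <= MaxK);
    assert(samples.size() % dim == 0 && centers.size() == k * dim);

    const std::size_t n = samples.size() / dim;
    std::vector<T> sums(k * dim, T(0));
    std::vector<std::size_t> counts(k, 0);

    for (std::size_t s = 0; s < n; ++s) {
        // Assignment step: nearest center by squared Euclidean distance.
        std::size_t best = 0;
        T bestDist = std::numeric_limits<T>::max();
        for (std::size_t c = 0; c < k; ++c) {
            T dist = T(0);
            for (std::size_t d = 0; d < dim; ++d) {
                const T diff = samples[s * dim + d] - centers[c * dim + d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        // Accumulate the sample into its cluster for the update step.
        for (std::size_t d = 0; d < dim; ++d) {
            sums[best * dim + d] += samples[s * dim + d];
        }
        ++counts[best];
    }

    // Update step: each non-empty cluster's center becomes the mean of its samples.
    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue; // an empty cluster keeps its old center
        for (std::size_t d = 0; d < dim; ++d) {
            centers[c * dim + d] = sums[c * dim + d] / static_cast<T>(counts[c]);
        }
    }
}
```

Calling `lloydIteration<double, 4, 4>(samples, 2, 2, centers)` runs one iteration on two-dimensional samples with two centers; repeating the call until the centers stop changing gives the usual training loop.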

The following conditions apply:

  1. Dim*Kcluster must not exceed a fixed value. For example, on an AMD Alveo™ U250 with centers stored in URAM, Dim*Kcluster<=1024*1024 for float and Dim*Kcluster<=1024*512 for double.
  2. KU and DV should be configured properly because of URAM limitations. For example, KU*DV=128 when centers are stored in URAM on an Alveo U250.
  3. The dynamic configuration should be close to the static one to avoid unnecessary internal computation.

Caution

Observe the applicable conditions above.
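
The first two conditions, together with the rule that dynamic values must not exceed static ones, can be written down as a compile-time check. The sketch below is an assumption-laden illustration, not part of the Vitis library: `maxCenterElements` and `configIsApplicable` are hypothetical names, and the limits are only the Alveo U250 examples quoted above. Note that both example limits describe the same storage size, since 1024*1024 float values and 1024*512 double values each occupy 4 MiB.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: example Dim*Kcluster limits for centers stored in
// URAM on an Alveo U250, as quoted in condition 1.
template <typename T>
constexpr std::size_t maxCenterElements() {
    static_assert(std::is_same<T, float>::value || std::is_same<T, double>::value,
                  "only the float and double examples are covered");
    return std::is_same<T, float>::value ? 1024u * 1024u : 1024u * 512u;
}

// Hypothetical check of a configuration against the example conditions.
// StaticDim and StaticK stand in for the static (template) configuration;
// dim and kcluster are the dynamic values passed at run time.
template <typename T, std::size_t StaticDim, std::size_t StaticK,
          std::size_t DV, std::size_t KU>
constexpr bool configIsApplicable(std::size_t dim, std::size_t kcluster) {
    return dim * kcluster <= maxCenterElements<T>() // condition 1 (example limit)
           && KU * DV == 128                        // condition 2 (example value)
           && dim <= StaticDim                      // dynamic must not exceed static
           && kcluster <= StaticK;
}
```

Condition 3 is a performance recommendation rather than a hard limit, so it is not checked here. With unroll factors DV=8 and KU=16, KU*DV is 128, which matches the example in condition 2.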

Benchmark

The following results are based on:
  1. Datasets from UCI:
    1. https://archive.ics.uci.edu/dataset/371/nips+conference+papers+1987+2015
    2. http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
    3. http://archive.ics.uci.edu/ml/datasets/US+Census+Data+%281990%29
  2. All data are processed as double;
  3. Unroll factors DV=8 and KU=16;
  4. Results are compared with Spark 2.4.4, using the initial centers from Spark to ensure the same input;
  5. Spark 2.4.4 is deployed on a server with 56 processors (Intel® Xeon® CPU E5-2690 v4 @ 2.60 GHz).