se.uu.it.mare

MaRe

class MaRe[T] extends Serializable

MaRe API: exposes MapReduce-like operations (map, reduce, collectReduce) that process the partitions of an underlying RDD through Docker container commands.

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new MaRe(rdd: RDD[T])(implicit arg0: ClassTag[T])

    rdd

    input RDD

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def cache: MaRe[T]

    Caches the underlying RDD in memory.

    returns

    new MaRe object

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. def collectReduce(inputMountPoint: MountPoint[T], outputMountPoint: MountPoint[T], imageName: String, command: String, localOutPath: String, forcePull: Boolean = false, intermediateStorageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK): Unit

    :: Experimental :: First collects the data locally on disk, and then reduces and writes it to a local output path using a Docker container command. This is an experimental feature (use at your own risk).

    inputMountPoint

    mount point through which the partitions are passed to the containers

    outputMountPoint

    mount point where the processed partition is read back to Spark

    imageName

    Docker image name

    command

    Docker command

    localOutPath

    local output path

    forcePull

    if set to true, the Docker image will be pulled even if present locally

    intermediateStorageLevel

    intermediate results storage level (default: MEMORY_AND_DISK)

    Annotations
    @Experimental()
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def getNumPartitions: Int

    Returns the number of partitions of the underlying RDD.

    returns

    number of partitions of the underlying RDD

  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. lazy val log: Logger

    Attributes
    protected
  16. def map[U](inputMountPoint: MountPoint[T], outputMountPoint: MountPoint[U], imageName: String, command: String, forcePull: Boolean = false)(implicit arg0: ClassTag[U]): MaRe[U]

    Maps each RDD partition through a Docker container command.

    inputMountPoint

    mount point through which the partitions are passed to the containers

    outputMountPoint

    mount point where the processed partition is read back to Spark

    imageName

    Docker image name

    command

    Docker command

    forcePull

    if set to true, the Docker image will be pulled even if present locally

    returns

    new MaRe object
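
    A minimal usage sketch (illustrative, not from the official docs): it assumes an initialized SparkContext `sc` and that this package provides a `TextFile` implementation of `MountPoint` for newline-separated text; the input path, image name, and command are placeholders.

    ```scala
    import org.apache.spark.SparkContext
    import se.uu.it.mare.MaRe

    // Assumes a running SparkContext and the TextFile mount point
    // from this package; paths, image, and command are illustrative.
    val sc: SparkContext = ??? // your SparkContext
    val lines = sc.textFile("hdfs:///data/sequences.txt")

    // Each partition is written to /input inside a container,
    // the command runs, and /output is read back as the new partition.
    val reversed = new MaRe(lines)
      .map(
        inputMountPoint = TextFile("/input"),
        outputMountPoint = TextFile("/output"),
        imageName = "ubuntu:xenial",
        command = "rev /input > /output")
      .rdd
    ```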

  17. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. final def notify(): Unit

    Definition Classes
    AnyRef
  19. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  20. val rdd: RDD[T]

    input RDD

  21. def reduce(inputMountPoint: MountPoint[T], outputMountPoint: MountPoint[T], imageName: String, command: String, depth: Int = 2, forcePull: Boolean = false): MaRe[T]

    Reduces the data to a single partition using a Docker container command. The command is applied using a tree reduce strategy.

    inputMountPoint

    mount point through which the partitions are passed to the containers

    outputMountPoint

    mount point where the processed partition is read back to Spark

    imageName

    Docker image name

    command

    Docker command

    depth

    depth of the reduce tree (default: 2, must be greater than or equal to 2)

    forcePull

    if set to true, the Docker image will be pulled even if present locally

    returns

    new MaRe object
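
    The tree reduce strategy can be illustrated in plain Scala, independent of Spark and Docker (a conceptual sketch only; `combine` stands in for the container command, which must be associative):

    ```scala
    // Conceptual tree reduce: partitions are combined pairwise,
    // level by level, rather than sequentially on a single node.
    def treeReduce[T](partitions: Seq[T], combine: (T, T) => T): T = {
      require(partitions.nonEmpty, "need at least one partition")
      var level = partitions
      while (level.length > 1) {
        // Combine adjacent pairs; a leftover element is carried upward.
        level = level.grouped(2).map {
          case Seq(a, b) => combine(a, b)
          case Seq(a)    => a
        }.toSeq
      }
      level.head
    }

    // Example: concatenating partition contents, as a command might.
    val parts = Seq("AC", "GT", "TT", "GA")
    val merged = treeReduce[String](parts, _ + _)
    ```

    With depth greater than 2, Spark inserts additional intermediate levels, which reduces the amount of data any single reducer must handle.
    
    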

  22. def repartition(numPartitions: Int): MaRe[T]

    Repartitions the underlying RDD to the specified number of partitions.

    numPartitions

    number of partitions for the underlying RDD

    returns

    new MaRe object

  23. def repartitionBy(keyBy: (T) ⇒ Any, numPartitions: Int): MaRe[T]

    Repartitions data according to keyBy and org.apache.spark.HashPartitioner.

    keyBy

    function that computes a key for a given record

    numPartitions

    number of partitions for the resulting RDD
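
    How hash partitioning assigns records to partitions can be sketched in plain Scala (a simplified illustration of what org.apache.spark.HashPartitioner does, including its non-negative modulo on the key's hash code):

    ```scala
    // Simplified hash partitioning: the key produced by keyBy is hashed
    // and mapped to a partition index in [0, numPartitions).
    def partitionFor(key: Any, numPartitions: Int): Int = {
      val raw = key.hashCode % numPartitions
      if (raw < 0) raw + numPartitions else raw // non-negative modulo
    }

    // Records with equal keys always land in the same partition.
    val keyBy: String => Any = record => record.take(1) // key = first char
    val records = Seq("apple", "avocado", "banana")
    val assigned = records.map(r => partitionFor(keyBy(r), 4))
    ```
    
    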

  24. def repartitionBy(keyBy: (T) ⇒ Any, partitioner: Partitioner): MaRe[T]

    Repartitions data according to keyBy and a custom partitioner.

    keyBy

    function that computes a key for a given record

    partitioner

    custom partitioner

  25. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  26. def toString(): String

    Definition Classes
    AnyRef → Any
  27. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
