Class AvroSchemaUtils

java.lang.Object
com.linkedin.venice.utils.AvroSchemaUtils

public class AvroSchemaUtils extends Object
  • Method Details

    • filterCanonicalizedSchemas

      public static List<SchemaEntry> filterCanonicalizedSchemas(SchemaEntry referenceSchema, Collection<SchemaEntry> schemas)
      Filter the given schemas using the referenceSchema and AvroCompatibilityHelper. The helper compares the canonicalized versions of the schemas, which means some differences are ignored when comparing two schemas. Specifically, it ignores things like docs and, at the time of writing, default values (which is a bug).
      Parameters:
      referenceSchema - used to find matching schema(s).
      schemas - to be filtered.
      Returns:
    • validateAvroSchemaStr

      public static void validateAvroSchemaStr(String str)
      Verifies that, for every field whose type is a union, the default value has the same type as the first element of the union. From https://avro.apache.org/docs/current/spec.html#Unions: (Note that when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. Thus, for unions containing "null", the "null" is usually listed first, since the default value of such unions is typically null.)
      Parameters:
      str -
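      The union-default rule quoted above can be illustrated with a minimal plain-Java sketch. It is a model only: it does not use Avro or Venice, and the class UnionDefaultRule, the enum Kind, and both methods are hypothetical names introduced for illustration.

```java
import java.util.List;

// Plain-Java model of the Avro union-default rule; not Avro or Venice code.
final class UnionDefaultRule {
  // Hypothetical stand-in for the few Avro type names this example needs.
  enum Kind { NULL, STRING, INT }

  // Classifies a default value by the stand-in kind it would belong to.
  static Kind kindOf(Object defaultValue) {
    if (defaultValue == null) {
      return Kind.NULL;
    }
    if (defaultValue instanceof String) {
      return Kind.STRING;
    }
    if (defaultValue instanceof Integer) {
      return Kind.INT;
    }
    throw new IllegalArgumentException("Unsupported default: " + defaultValue);
  }

  // The default is acceptable only if its kind equals the FIRST union branch.
  static boolean defaultMatchesFirstBranch(List<Kind> unionBranches, Object defaultValue) {
    return !unionBranches.isEmpty() && unionBranches.get(0) == kindOf(defaultValue);
  }
}
```

      Under this model, a null default is accepted for a union declared as [NULL, STRING] and rejected for [STRING, NULL], which is why "null" is usually listed first.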
    • isValidAvroSchema

      public static boolean isValidAvroSchema(org.apache.avro.Schema schema)
    • validateAvroSchemaStr

      public static void validateAvroSchemaStr(org.apache.avro.Schema schema)
    • filterSchemas

      public static List<SchemaEntry> filterSchemas(SchemaEntry referenceSchema, Collection<SchemaEntry> schemas)
      Filter the given schemas using the referenceSchema and the underlying Schema.equals method.
      Parameters:
      referenceSchema -
      schemas -
      Returns:
    • schemaResolveHasErrors

      public static boolean schemaResolveHasErrors(org.apache.avro.Schema writerSchema, org.apache.avro.Schema readerSchema) throws IOException
      Preemptive check to see if the given writer and reader schema can be resolved without errors.
      Parameters:
      writerSchema - is the schema used when serializing the object.
      readerSchema - is the schema used when deserializing the object.
      Returns:
      a boolean that indicates whether there were errors.
      Throws:
      IOException
    • generateSchemaWithNamespace

      public static org.apache.avro.Schema generateSchemaWithNamespace(String schemaStr, String namespace) throws IOException
      Generate a new schema based on the provided schema string with the namespace specified by namespace.
      Parameters:
      schemaStr - is the original string of the writer schema. This is because string -> avro schema -> string may not give back the original schema string.
      namespace - the desired namespace for the generated schema.
      Returns:
      a new Schema with the specified namespace.
      Throws:
      IOException
    • compareSchemaIgnoreFieldOrder

      public static boolean compareSchemaIgnoreFieldOrder(org.apache.avro.Schema s1, org.apache.avro.Schema s2)
      Compares two schemas, allowing for re-ordering of the fields. Otherwise, it compares every field at every level.
      Parameters:
      s1 -
      s2 -
      Returns:
      true if the schemas are the same, allowing for reordered fields.
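      As a rough illustration of an order-insensitive comparison, the sketch below matches fields by name and recurses into each one. It is a plain-Java model, not the Venice implementation: the class FieldOrderInsensitiveCompare and its Node type are hypothetical stand-ins for a schema, and real Avro schemas carry more attributes than a type name and child fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of a field-order-insensitive comparison; not Avro or Venice code.
final class FieldOrderInsensitiveCompare {
  // Hypothetical stand-in for a schema node: a type name plus named child fields.
  static final class Node {
    final String type;
    // LinkedHashMap keeps declaration order, so two nodes can differ only in order.
    final Map<String, Node> fields = new LinkedHashMap<>();

    Node(String type) {
      this.type = type;
    }

    Node field(String name, Node child) {
      fields.put(name, child);
      return this;
    }
  }

  // Fields are matched by name rather than by position, then compared recursively.
  static boolean sameIgnoringFieldOrder(Node a, Node b) {
    if (!a.type.equals(b.type) || a.fields.size() != b.fields.size()) {
      return false;
    }
    for (Map.Entry<String, Node> entry : a.fields.entrySet()) {
      Node other = b.fields.get(entry.getKey());
      if (other == null || !sameIgnoringFieldOrder(entry.getValue(), other)) {
        return false;
      }
    }
    return true;
  }
}
```

      In this model, two records that declare the same fields in a different order compare as equal, while a change to any field's type at any depth makes the comparison fail.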
    • getFieldDefault

      @Nullable public static Object getFieldDefault(org.apache.avro.Schema.Field field)
      Non-throwing flavor of AvroCompatibilityHelper.getGenericDefaultValue, returns null if there is no default.
    • generateSupersetSchemaFromAllValueSchemas

      public static SchemaEntry generateSupersetSchemaFromAllValueSchemas(Collection<SchemaEntry> allValueSchemaEntries)
    • hasDocFieldChange

      public static boolean hasDocFieldChange(org.apache.avro.Schema s1, org.apache.avro.Schema s2)
      Given that SchemaEntry#equals(s1, s2) returned true for s1 and s2, checks whether they differ in a doc field. It assumes the rest of the fields are exactly the same. DO NOT USE this to compare schemas.
      Parameters:
      s1 -
      s2 -
      Returns:
      true if s1 and s2 differ in a doc field when checked recursively; false if s1 and s2 are exactly the same, including the doc (note that this method does not check for strict equality).
    • validateTopLevelFieldDefaultsValueRecordSchema

      public static void validateTopLevelFieldDefaultsValueRecordSchema(org.apache.avro.Schema valueRecordSchema)
    • isNullableUnionPair

      public static boolean isNullableUnionPair(org.apache.avro.Schema unionSchema)
      Parameters:
      unionSchema -
      Returns:
      true if and only if the schema is of type UNION, has exactly 2 types, and one of them is NULL.
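      The condition described above can be sketched in plain Java as follows. This is a model only: it does not call Avro, and the class NullableUnionPair, the enum Kind, and the method isNullablePair are hypothetical names introduced for illustration.

```java
import java.util.List;

// Plain-Java model of the "nullable union pair" condition; not Avro or Venice code.
final class NullableUnionPair {
  // Hypothetical stand-in for the Avro type names this example needs.
  enum Kind { NULL, STRING, INT, UNION }

  // True only for a UNION with exactly two branches, one of which is NULL.
  static boolean isNullablePair(Kind schemaKind, List<Kind> branches) {
    return schemaKind == Kind.UNION && branches.size() == 2 && branches.contains(Kind.NULL);
  }
}
```

      Under this model, [NULL, STRING] and [STRING, NULL] both qualify, whereas [NULL, STRING, INT] and [STRING, INT] do not.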
    • createFlattenedUnionSchema

      public static org.apache.avro.Schema createFlattenedUnionSchema(List<org.apache.avro.Schema> schemasInUnion)
    • createGenericRecord

      public static org.apache.avro.generic.GenericRecord createGenericRecord(org.apache.avro.Schema originalSchema)
      Create a GenericRecord from a given schema. The created record has default values set on all fields. Note that all fields in the given schema must have default values. Otherwise, an exception is thrown.
    • containsOnlyOneCollection

      public static void containsOnlyOneCollection(org.apache.avro.Schema unionSchema)
      Utility function that checks that a given union schema contains only one collection type among its types. Multiple collection types would make the flattened write compute schema behave ambiguously.
      Parameters:
      unionSchema - a union schema to validate.
      Throws:
      VeniceException - when the unionSchema contains more than one collection type.
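      A minimal plain-Java sketch of such a check is shown below. It rests on two assumptions that are not stated in this page: that "collection type" means an array or a map, and that an IllegalStateException can stand in for VeniceException. The class SingleCollectionCheck, the enum Kind, and the method requireAtMostOneCollection are hypothetical.

```java
import java.util.List;

// Plain-Java model of the single-collection check; not Avro or Venice code.
final class SingleCollectionCheck {
  // Hypothetical stand-in for Avro type names; ARRAY and MAP are ASSUMED to be the collection types.
  enum Kind { NULL, STRING, INT, ARRAY, MAP }

  // Throws if the union branches contain more than one collection type.
  // IllegalStateException is a stand-in for VeniceException in this sketch.
  static void requireAtMostOneCollection(List<Kind> unionBranches) {
    long collections = unionBranches.stream()
        .filter(kind -> kind == Kind.ARRAY || kind == Kind.MAP)
        .count();
    if (collections > 1) {
      throw new IllegalStateException("Union contains " + collections + " collection types");
    }
  }
}
```

      With these assumptions, a union such as [NULL, ARRAY] passes, while [ARRAY, MAP] is rejected because a value could not be attributed to a single collection branch.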
    • isUnresolvedUnionExceptionAvailable

      public static boolean isUnresolvedUnionExceptionAvailable()
      Returns:
      true if UnresolvedUnionException is available in the Avro version on the classpath, or false otherwise
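      One common way to perform this kind of classpath check is a reflective lookup with Class.forName. The sketch below shows that general technique; whether AvroSchemaUtils does it this way is an assumption, and the class ClasspathProbe and its method isClassAvailable are hypothetical names.

```java
// Sketch of a reflective classpath probe; an assumed technique, not the Venice implementation.
final class ClasspathProbe {
  // Returns true if the named class can be loaded, without running its static initializers.
  static boolean isClassAvailable(String className) {
    try {
      Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // The class is missing, or present but not loadable in this environment.
      return false;
    }
  }
}
```

      A caller could then probe for the exception class by its fully qualified name, for example isClassAvailable("org.apache.avro.UnresolvedUnionException"), and choose a fallback code path when the result is false.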