Create and update mapping domain-object will be different for MultiFieldMapping #873

Closed
Tasteful opened this issue Aug 13, 2014 · 6 comments

Comments

@Tasteful
Contributor

When using multi-field mapping, the domain object read back from the server is not the same as the one that was posted to it.

I am adding a field with 3 variants (the default plus 2 non-default ones):

{
  "mappings": {
    "da4d54ec38b0413c96bf94f3b4280265": {
      "properties": {
        "name": {
          "type": "multi_field",
          "fields": {
            "name": {
              "type": "string",
              "store": true,
              "index": "not_analyzed",
              "term_vector": "no"
            },
            "name_analyzed": {
              "type": "string",
              "store": false,
              "index": "analyzed",
              "term_vector": "no"
            },
            "name_notanalyzed": {
              "type": "string",
              "store": false,
              "index": "not_analyzed",
              "term_vector": "no"
            }
          }
        }
      }
    }
  }
}
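
For context, a mapping like the one above is built with the TypeMapping/TypeMappingProperty domain objects, following the same pattern as the documented example quoted later in this thread. This is only a rough sketch; the Store property name on TypeMappingProperty is an assumption on my part.

// Sketch only: building the 3-variant multi field with the domain objects.
// The Store property on TypeMappingProperty is an assumption; term_vector "no"
// is the default and is therefore not set here.
var typeMapping = new TypeMapping(Guid.NewGuid().ToString("n"));

var property = new TypeMappingProperty { Type = "multi_field" };

property.Fields = new Dictionary<string, TypeMappingProperty>
{
    // default variant, same name as the field itself
    { "name", new TypeMappingProperty { Type = "string", Index = "not_analyzed", Store = true } },
    { "name_analyzed", new TypeMappingProperty { Type = "string", Index = "analyzed" } },
    { "name_notanalyzed", new TypeMappingProperty { Type = "string", Index = "not_analyzed" } }
};

typeMapping.Properties.Add("name", property);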

When I request the mapping back from ES, I get JSON like this, where the type is "string" and the multi field that has the same name as the field is exposed directly on the field node.

{
  "e607244a-e95d-411e-9f7c-f78fd10ec3da" : {
    "mappings" : {
      "da4d54ec38b0413c96bf94f3b4280265" : {
        "properties" : {
          "name" : {
            "type" : "string",
            "index" : "not_analyzed",
            "store" : true,
            "fields" : {
              "name_notanalyzed" : {
                "type" : "string",
                "index" : "not_analyzed"
              },
              "name_analyzed" : {
                "type" : "string"
              }
            }
          }
        }
      }
    }
  }
}

When this is parsed back into a domain object, the response converter creates a MultiFieldMapping with the type "string" and the correct name, but other attributes on the field such as index, store, etc. are missing.

When I later update the mappings, the following request is sent (missing the index/store attributes). I don't know whether it has any negative impact that these attributes are missing.

{
  "elasticsearchprojects": {
    "properties": {
      "name": {
        "type": "string",
        "fields": {
          "name_notanalyzed": {
            "type": "string",
            "index": "not_analyzed"
          },
          "name_analyzed": {
            "type": "string"
          }
        }
      }
    }
  }
}

This is not consistent between how the domain object is used when creating the mapping and how it looks after fetching the mapping. You need two different pieces of code to handle the mappings: one for creating and one for validating and updating.

@Tasteful
Contributor Author

Is there any reason why ElasticTypeConverter.GetTypeFromJObject creates a MultiFieldMapping and changes the type to originalType, instead of parsing the JSON into the domain objects that were created for multi-field mapping?

@Tasteful
Contributor Author

If we look at the documented example of how to map multi fields (from http://nest.azurewebsites.net/nest/indices/put-mapping.html), it says:

var typeMapping = new TypeMapping(Guid.NewGuid().ToString("n"));
var property = new TypeMappingProperty
{
    Type = "multi_field"
};

var primaryField = new TypeMappingProperty
{
    Type = "string", 
    Index = "not_analyzed"
};

var analyzedField = new TypeMappingProperty
{
    Type = "string", 
    Index = "analyzed"
};

property.Fields = new Dictionary<string, TypeMappingProperty>();
property.Fields.Add("name", primaryField);
property.Fields.Add("name_analyzed", analyzedField);

typeMapping.Properties.Add("name", property);

With this code we can later check property.Type == "multi_field", but when deserializing the response from ES, property.Type == "string" even though the property object is still of type MultiFieldMapping.

Either the type should be parsed as

  1. "multi_field", adding the default variant into the Fields collection just as it was created in the example above, or
  2. the actual type from the JSON (string, integer, etc.), adding a new Fields collection property to each mapping type and deprecating the current MultiFieldMapping (roughly sketched below).

Both ways I have found involve breaking changes, either in the code used for creating the mapping or in the code reading the mapping.
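
To illustrate option 2, the sub-fields would live directly on the core type mapping, mirroring the JSON that ES >= 1.0 returns. The class and property names below are only illustrative, not the actual NEST types.

// Rough sketch of option 2 (illustrative names only): the Fields collection sits
// directly on the core type mapping instead of on a separate MultiFieldMapping.
using Newtonsoft.Json;
using System.Collections.Generic;

public class StringCoreTypeMapping
{
    [JsonProperty("type")]
    public string Type { get; set; }        // "string", never "multi_field"

    [JsonProperty("index")]
    public string Index { get; set; }

    [JsonProperty("store")]
    public bool? Store { get; set; }

    [JsonProperty("term_vector")]
    public string TermVector { get; set; }

    // Sub-fields of a multi field, keyed by field name.
    [JsonProperty("fields")]
    public IDictionary<string, TypeMappingProperty> Fields { get; set; }
}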

@drewr or @Mpdreamz, do you have any other suggestions on how to solve this, or on how to handle the breaking change in option 1 or 2?

I have already started on solution 1, but then I discovered all the breaking changes and need a decision on how to proceed.

@gmarz
Contributor

gmarz commented Aug 15, 2014

@Tasteful, to answer your original question, index, store, and term_vector aren't missing. You're just passing the default values, and the mapping will only reflect options that are different from the defaults. term_vector defaults to no, index defaults to analyzed, and store defaults to false, so there is no need to explicitly set those values.

As for property.Type == "string": since Elasticsearch 1.0, multi_field isn't stored as the type anymore. You can see this in the response if you issue a GET mapping request to ES. Instead, the primary field is set to its actual type (in this case string), and if it has fields, then it's implied that it's a multi field. That's why in ElasticTypeConverter.CreateTypeFromJObject we look for fields and set the type to multi_field temporarily, just so we can deserialize it as a multi field and get all the field properties, but then put back the original type.
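
In pseudo-code the behaviour is roughly the following. This is only a minimal sketch, not the actual NEST source; the method name, the Json.NET JObject handling, and the settable Type property on MultiFieldMapping are assumptions.

// Minimal sketch of the behaviour described above (assumes using Newtonsoft.Json
// and Newtonsoft.Json.Linq; method and property names are illustrative).
internal static MultiFieldMapping ReadMultiField(JObject po, JsonSerializer serializer)
{
    // ES >= 1.0 returns the real type ("string", "integer", ...) plus a "fields"
    // node, so the presence of "fields" is what implies a multi field.
    var originalType = (string)po["type"];

    // Temporarily pretend it is a multi_field so it deserializes into
    // MultiFieldMapping and the sub-fields are picked up...
    po["type"] = "multi_field";
    var mapping = po.ToObject<MultiFieldMapping>(serializer);

    // ...then put the original type back on the resulting domain object.
    mapping.Type = originalType;
    return mapping;
}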

Hope that makes sense?

@Tasteful
Contributor Author

@gmarz, but MultiFieldMapping doesn't have any properties for index, store, and term_vector, or the null_value for the numeric fields (and a lot of other properties). This makes it impossible to run GetMapping for an index and then run PutMapping with the same data (or with some other data added). That operation throws an exception, see the test in PR #882.

If multi_field is not sent back as the type, I suggest that the field should not be deserialized as a MultiFieldMapping; instead, add the Fields collection to the regular core types that support multi_field.

@gmarz
Contributor

gmarz commented Aug 15, 2014

@Tasteful, OK I see your point here.

Agreed, we should add Fields to the core types. This is going to be a breaking change no matter what and will have to be made in the 2.0 branch, so there's no need to even deprecate MultiFieldMapping; we can just remove it.

For 1.1, MultiFieldMapping obviously needs to stay for backwards compatibility reasons, but we should be able to extend it and add all of the needed properties.
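Something along these lines for 1.1, i.e. keeping MultiFieldMapping but filling in the properties it is missing today. This is just a sketch of the idea, not final property names.

// Sketch only: the kind of properties MultiFieldMapping lacks today
// (index, store, term_vector, null_value, ...) and could gain in 1.1.
using Newtonsoft.Json;
using System.Collections.Generic;

public class MultiFieldMapping
{
    [JsonProperty("type")]
    public string Type { get; set; }

    [JsonProperty("index")]
    public string Index { get; set; }

    [JsonProperty("store")]
    public bool? Store { get; set; }

    [JsonProperty("term_vector")]
    public string TermVector { get; set; }

    [JsonProperty("null_value")]
    public string NullValue { get; set; }

    [JsonProperty("fields")]
    public IDictionary<string, TypeMappingProperty> Fields { get; set; }
}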

@Tasteful
Contributor Author

During the evening I removed all the MultiFieldMappers (for the big breaking change). I don't know if I should open a pull request for that already... you will find my branch here: https://github.com/Tasteful/elasticsearch-net/tree/issue/873-2
